API for auditing physical files and file metadata #11016
base: develop
I took a quick pass through the docs and code. @stevenwinship please let me know what you think.
Auditing specific Datasets (comma separated list)::
curl "$SERVER_URL/api/admin/datafiles/auditFiles?DatasetIdentifierList=doi.org/10.5072/FK2/JXYBJS,doi.org/10.7910/DVN/MPU019"
Do we use this pattern of passing in the URL form of a PID minus "https://" anywhere else? It seems ok. Can we pass in the normal PIDs (the non-URL form) instead?
9/21/22 Durbin:
Batch Exports Through the API
...
curl http://localhost:8080/api/admin/metadata/:persistentId/reExportDataset?persistentId=doi:10.5072/FK2/AAA000
No, it's different: "doi.org/10..." vs. "doi:10...".
In this PR we should use the pattern from reExportDataset.
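For reference, converting between the two forms discussed here is mechanical. A minimal Python sketch, assuming a hypothetical helper named `to_pid` (not part of Dataverse) that only handles DOIs:

```python
def to_pid(identifier: str) -> str:
    """Convert the URL form of a DOI to the "doi:" form.

    Hypothetical helper for illustration only; it is not part of Dataverse.
    """
    value = identifier.strip()
    # Drop an optional scheme such as "https://".
    for scheme in ("https://", "http://"):
        if value.startswith(scheme):
            value = value[len(scheme):]
            break
    # "doi.org/10.5072/FK2/AAA000" -> "doi:10.5072/FK2/AAA000"
    if value.startswith("doi.org/"):
        return "doi:" + value[len("doi.org/"):]
    # Already in "doi:" form (or not a DOI at all): return it unchanged.
    return value
```

With this, both `doi.org/10.5072/FK2/JXYBJS` and `https://doi.org/10.5072/FK2/JXYBJS` map to `doi:10.5072/FK2/JXYBJS`, which matches the form used in the reExportDataset example.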
updated the doc
"identifier": "DVN/MPU019",
"persistentURL": "https://doi.org/10.7910/DVN/MPU019",
"missingFiles": [
"s3://dvn-cloud:298910, jihad_metadata_edited.csv"
Same. Easier parsing would be nice.
re-formatted the json output:
"missingFiles": [
  {
    "StorageIdentifier": "s3://dvn-cloud:298910",
    "label": "jihad_metadata_edited.csv"
  }
]
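To illustrate why the object form is easier to consume than the earlier comma-joined string, here is a small Python sketch. It assumes the fragment above sits inside a JSON object (the enclosing braces are added here for the sample), and the function name `missing_file_entries` is hypothetical:

```python
import json

# Sample shaped like the reformatted fragment quoted above; the enclosing
# object is an assumption made so that the sample is valid JSON.
sample = """
{
  "missingFiles": [
    {
      "StorageIdentifier": "s3://dvn-cloud:298910",
      "label": "jihad_metadata_edited.csv"
    }
  ]
}
"""


def missing_file_entries(payload):
    """Return (storage identifier, label) pairs from one dataset entry."""
    data = json.loads(payload)
    # Each entry is an object, so no string splitting on ", " is needed.
    return [
        (entry["StorageIdentifier"], entry["label"])
        for entry in data.get("missingFiles", [])
    ]


for storage_id, label in missing_file_entries(sample):
    print(storage_id, label)
```

With the earlier string form, a label containing ", " would have been ambiguous to split; separate keys avoid that.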
Great! Thanks. Do we need the directoryLabel too?
added directoryLabel
Co-authored-by: Philip Durbin <[email protected]>
What this PR does / why we need it: Find Datasets with missing files so Admins can either delete the file reference or work with authors to re-upload the files.
See: IQSS/dataverse.harvard.edu#220
Which issue(s) this PR closes:
Special notes for your reviewer:
Suggestions on how to test this: Create multiple Datasets with multiple files. If running in Docker locally, delete a file from docker-dev-volumes/app/data/store...
Call the API and confirm that the missing file is listed in the JSON response.
Other tests could include deleting a FileMetadata row from the DB.
Request specific Datasets as well as a firstId and lastId range.
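The request variants in the testing steps above could be assembled along these lines. This is only a sketch: the endpoint path and the `DatasetIdentifierList`, `firstId`, and `lastId` parameter names are taken from this thread, the function name `audit_url` is hypothetical, and passing identifiers in the `doi:` form is an assumption based on the review discussion.

```python
from urllib.parse import urlencode


def audit_url(server_url, first_id=None, last_id=None, dataset_ids=None):
    """Build an auditFiles request URL (hypothetical helper for testing)."""
    params = {}
    if first_id is not None:
        params["firstId"] = first_id
    if last_id is not None:
        params["lastId"] = last_id
    if dataset_ids:
        # The thread describes this parameter as a comma separated list.
        params["DatasetIdentifierList"] = ",".join(dataset_ids)
    base = server_url.rstrip("/") + "/api/admin/datafiles/auditFiles"
    # Keep "/", ":" and "," readable instead of percent-encoding them.
    query = urlencode(params, safe="/:,")
    return base + ("?" + query if query else "")


print(audit_url("http://localhost:8080", first_id=1, last_id=100))
print(audit_url("http://localhost:8080", dataset_ids=["doi:10.5072/FK2/JXYBJS"]))
```

The resulting URLs can be passed to curl or any HTTP client when exercising the ID range and the specific-Datasets cases.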
Does this PR introduce a user interface change? If mockups are available, please link/include them here: No
Is there a release notes update needed for this change?: Included
Additional documentation: