v5.6
Dataverse Software 5.6
This release brings new features, enhancements, and bug fixes to the Dataverse Software. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project.
Release Highlights
Anonymized Access in Support of Double Blind Review
Dataverse installations can select whether or not to allow users to create anonymized private URLs and can control which specific identifying fields are anonymized. If this is enabled, users can create private URLs that do not expose identifying information about dataset depositors, allowing for double blind reviews of datasets in the Dataverse installation.
Guestbook Responses API
A new API to retrieve Guestbook responses has been added. This makes it easier to retrieve the records for large guestbooks and also makes it easier to integrate with external systems.
Dataset Semantic API (Experimental)
Dataset metadata can be retrieved, set, and updated using a new, flatter JSON-LD format - following the format of an OAI-ORE export (RDA-conformant Bags), allowing for easier transfer of metadata to/from other systems (i.e. without needing to know Dataverse's metadata block and field storage architecture). This new API also allows for the update of terms metadata (#5899).
This development was supported by the Research Data Alliance, DANS, and Sciences PO and follows the recommendations from the Research Data Repository Interoperability Working Group.
Dataset Migration API (Experimental)
Datasets can now imported following the format of an OAI-ORE export (RDA-conformant Bags), allowing for easier migration from one Dataverse installation to another, and migration from other systems. This experimental, superuser only, endpoint also allows keeping the existing persistent identifier (where the authority and shoulder match those for which the software is configured) and publication dates.
This development was supported by DANS and the Research Data Alliance and follows the recommendations from the Research Data Repository Interoperability Working Group.
Direct Upload API Now Available for Adding Multiple Files' Metadata to the Dataset
Using the Direct Upload API, users can now add metadata of multiple files to the dataset after the files exist in the S3 bucket. This makes direct uploads more efficient and reduces server load by only updating the dataset once instead of once per file. For more information, see the Direct DataFile Upload/Replace API section of the Dataverse Software Guides.
Major Use Cases
Newly-supported major use cases in this release include:
- Users can create Private URLs that anonymize dataset metadata, allowing for double blind peer review. (Issue #1724, PR #7908)
- Users can download Guestbook records using a new API. (Issue #7767, PR #7931)
- Users can update terms metadata using the new semantic API. (Issue #5899, PR #7414)
- Users can retrieve, set, and update metadata using a new, flatter JSON-LD format. (Issue #6497, PR #7414)
- Administrators can use the Traces API to retrieve information about specific types of user activity (Issue #7952, PR #7953)
Notes for Dataverse Installation Administrators
New Database Constraint
A new DB Constraint has been added in this release. Full instructions on how to identify whether or not your database needs any cleanup before the upgrade can be found in the Dataverse software GitHub repository. This information was also emailed out to Dataverse installation contacts.
Payara 5.2021.5 (or Higher) Required
Some changes in this release require an upgrade to Payara 5.2021.5 or higher. (See the upgrade section).
Instructions on how to update can be found in the Payara documentation We've included the necessary steps below, but we recommend that you review the Payara upgrade instructions as it could be helpful during any troubleshooting.
Installations upgrading from a previous Payara version shouldn't encounter a logging configuration bug in Payara-5.2021.5, but if your server.log fills with repeated notes about logging configuration and WELD complaints about loading beans, see the paragraph on logging.properties
in the Installation Guide
Enhancement to DDI Metadata Exports
To increase support for internationalization and to improve compliance with CESSDA requirements, DDI exports now have a holdings element with a URI attribute whose value is the URL form of the dataset PID.
New JVM Options and DB Settings
:AnonymizedFieldTypeNames can be used to enable creation of anonymized Private URLs and to specify which fields will be anonymized.
Notes for Tool Developers and Integrators
Semantic API
The new Semantic API is especially helpful in data migrations and getting metadata into a Dataverse installation. Learn more in the Developers Guide.
Complete List of Changes
For the complete list of code changes in this release, see the 5.6 Milestone in Github.
For help with upgrading, installing, or general questions please post to the Dataverse Community Google Group or email [email protected].
Installation
If this is a new installation, please see our Installation Guide.
Upgrade Instructions
0. These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the Dataverse Software 5 Release Notes. After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.6.
The steps below include a required upgrade to Payara 5.2021.5 or higher. (It is a simple matter of reusing your existing domain directory with the new distribution). But we also recommend that you review the Payara upgrade instructions as it could be helpful during any troubleshooting: Payara documentation
If you are running Payara as a non-root user (and you should be!), remember not to execute the commands below as root. Use sudo
to change to that user first. For example, sudo -i -u dataverse
if dataverse
is your dedicated application user.
In the following commands we assume that Payara 5 is installed in /usr/local/payara5
. If not, adjust as needed.
export PAYARA=/usr/local/payara5
(or setenv PAYARA /usr/local/payara5
if you are using a csh
-like shell)
1. Undeploy the previous version
$PAYARA/bin/asadmin list-applications
$PAYARA/bin/asadmin undeploy dataverse<-version>
2. Stop Payara
service payara stop
rm -rf $PAYARA/glassfish/domains/domain1/generated
3. Move the current Payara directory out of the way
mv $PAYARA $PAYARA.MOVED
4. Download the new Payara version (5.2021.5+), and unzip it in its place
5. Replace the brand new payara/glassfish/domains/domain1 with your old, preserved domain1
6. Start Payara
service payara start
7. Deploy this version.
$PAYARA/bin/asadmin deploy dataverse-5.6.war
8. Restart payara
service payara stop
service payara start
10. Run ReExportall to update JSON Exports http://guides.dataverse.org/en/5.6/admin/metadataexport.html?highlight=export#batch-exports-through-the-api
Additional Release Steps
If your installation relies on the database-side stored procedure for generating sequential numeric identifiers:
Note that you can skip this step if your installation uses the default-style, randomly-generated six alphanumeric character-long identifiers for your datasets! This is the case with most Dataverse installations.
The underlying database framework has been modified in this release, to make it easier for installations to create custom procedures for generating identifier strings that suit their needs. Your current configuration will be automatically updated by the database upgrade (Flyway) script incorporated in the release. No manual configuration changes should be necessary. However, after the upgrade, we recommend that you confirm that your installation can still create new datasets, and that they are still assigned sequential numeric identifiers. In the unlikely chance that this is no longer working, please re-create the stored procedure following the steps described in the documentation for the :IdentifierGenerationStyle
setting in the Configuration section of the Installation Guide for this release (v5.6).
(Running the script supplied there will NOT overwrite the position on the sequence you are currently using!)