Feature Request: Send additional license information to DataCite #10883

Open
6 tasks
philippconzett opened this issue Sep 26, 2024 · 0 comments
Labels
Size: 50 (a percentage of a sprint; 35 hours) · Type: Feature (a feature request)

Overview of the Feature Request
The issue is a request for Dataverse to send additional license information to DataCite. Big thanks to @qqmyers for drafting this issue!

In v6.4, Dataverse sends rights information to DataCite based on the URI and name of the registered license, e.g.:

<rightsList>
  <rights rightsURI="info:eu-repo/semantics/restrictedAccess"/>
  <rights rightsURI="https://qdr.syr.edu/policies/qdr-standard-access-conditions">Standard Access</rights>
</rightsList>

DataCite's v4.5 schema allows additional optional attributes on the rights element, specifically:

<xs:attribute name="rightsIdentifier" use="optional"/>
<xs:attribute name="rightsIdentifierScheme" use="optional"/>
<xs:attribute name="schemeURI" type="xs:anyURI" use="optional"/>
<xs:attribute ref="xml:lang"/>

They give an example:

<rights xml:lang="en" schemeURI="https://spdx.org/licenses/" rightsIdentifierScheme="SPDX" rightsIdentifier="CC-BY-4.0" rightsURI="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International</rights>

in which the rightsIdentifier, rightsIdentifierScheme, and schemeURI are taken from the license information on SPDX.
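As a rough illustration of how such an element could be assembled, here is a minimal sketch in Python (chosen for brevity; Dataverse itself is written in Java). The dictionary keys and the function name `build_rights_element` are hypothetical and do not correspond to existing Dataverse code. Only attributes that have a value are emitted, so a license without SPDX data still yields a plain `rightsURI` element.

```python
import xml.etree.ElementTree as ET

# Clark-notation name for xml:lang; ElementTree serializes it with the "xml" prefix.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Pairs of (hypothetical license key, DataCite attribute name).
OPTIONAL_ATTRIBUTES = (
    ("scheme_uri", "schemeURI"),
    ("rights_identifier_scheme", "rightsIdentifierScheme"),
    ("rights_identifier", "rightsIdentifier"),
    ("uri", "rightsURI"),
)


def build_rights_element(license_info, lang=None):
    """Build a DataCite <rights> element, skipping attributes with no value."""
    rights = ET.Element("rights")
    if lang:
        rights.set(XML_LANG, lang)
    for key, attribute in OPTIONAL_ATTRIBUTES:
        value = license_info.get(key)
        if value:
            rights.set(attribute, value)
    # Assumption: prefer the longer description and fall back to the name.
    rights.text = license_info.get("description") or license_info.get("name")
    return rights


cc_by = {
    "name": "CC BY 4.0",
    "description": "Creative Commons Attribution 4.0 International",
    "uri": "https://creativecommons.org/licenses/by/4.0/",
    "rights_identifier": "CC-BY-4.0",
    "rights_identifier_scheme": "SPDX",
    "scheme_uri": "https://spdx.org/licenses/",
}
print(ET.tostring(build_rights_element(cc_by, lang="en"), encoding="unicode"))
```

Passing `lang=None` omits the `xml:lang` attribute entirely, which may suit installations that define licenses in languages other than English.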

The example also uses a longer text description, i.e. what we currently capture as the 'description' (which SPDX considers to be the ‘name’).

It also includes an xml:lang="en" attribute.

Completing this issue would involve:

  • Adding a means to store the rightsIdentifier, rightsIdentifierScheme, and schemeURI associated with a license (e.g. by adding columns to the license table and parsing the additional fields from the input json)
  • Updating the DataCite XML generation code to include these additional fields, when present, and to use the 'short description' rather than the 'name' (when provided) in the DataCite XML
  • Updating the existing JSON examples for licenses that have SPDX entries to include the additional fields
  • Creating a one-time mechanism to update existing database license entries, e.g. a Flyway script (assuming there are no community concerns or we provide guidance on how to undo the update)
  • Updating the guidance in the Guides as to how to define new licenses with these additional attributes
  • Setting the xml:lang attribute to "en" for all licenses (since we currently define all licenses in English and use our existing Properties file mechanism to provide translations into other languages in the UI). Alternatively, the xml:lang attribute could be omitted if there are sites defining licenses directly in other languages rather than providing a Properties-based translation. (Note the possible follow-on issue below to improve language handling.)
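The parsing and one-time update steps above could look roughly like the following Python sketch (Dataverse itself would do this in Java and SQL). The JSON field names `rightsIdentifier`, `rightsIdentifierScheme`, and `schemeUri`, the lookup table, and both function names are assumptions made for illustration; the actual column and field names would be decided during implementation.

```python
import json

SPDX_SCHEME = "SPDX"
SPDX_SCHEME_URI = "https://spdx.org/licenses/"

# Hypothetical field names for the new optional license attributes.
NEW_FIELDS = ("rightsIdentifier", "rightsIdentifierScheme", "schemeUri")

# Hypothetical lookup from a normalized license URI to its SPDX identifier.
SPDX_ID_BY_URI = {
    "creativecommons.org/publicdomain/zero/1.0": "CC0-1.0",
    "creativecommons.org/licenses/by/4.0": "CC-BY-4.0",
}


def uri_key(uri):
    """Normalize a URI by dropping the scheme and any trailing slash."""
    return uri.split("://", 1)[-1].rstrip("/").lower()


def parse_license(json_text):
    """Parse a license definition; the new fields are optional and default to None."""
    data = json.loads(json_text)
    for field in NEW_FIELDS:
        data.setdefault(field, None)
    return data


def backfill_spdx(license_info):
    """Return a copy with SPDX fields filled in when the URI is a known SPDX license."""
    updated = dict(license_info)
    if updated.get("rightsIdentifier"):
        return updated  # already set explicitly; leave it alone
    spdx_id = SPDX_ID_BY_URI.get(uri_key(updated["uri"]))
    if spdx_id:
        updated.update(
            rightsIdentifier=spdx_id,
            rightsIdentifierScheme=SPDX_SCHEME,
            schemeUri=SPDX_SCHEME_URI,
        )
    return updated
```

Licenses whose URI is not in the lookup, such as the QDR example, pass through unchanged, so non-SPDX licenses keep working as they do today.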

Out-of-scope/Possible follow-on issues:

  • Using the new attributes in the Dataverse UI
  • Updating any other PID Provider or Exporter to use the new attributes
  • Supporting the native SPDX JSON format (see https://github.com/spdx/license-list-data/blob/main/json/licenses.json) as an alternative syntax for defining a Dataverse license. (Note that support for non-SPDX licenses needs to be retained, as for the QDR example given above.)
  • Supporting the submission of licenses in different languages, indicating which language is used in the DataCite XML, and/or allowing the DataCite XML to use a license language that matches the platform language or the dataset’s metadataLanguage.
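For the SPDX JSON follow-on idea, a conversion might be sketched as below, again in Python purely for illustration. The input keys `licenseId`, `name`, and `seeAlso` are fields of entries in the SPDX license list; the output field names, the choice to use the SPDX identifier as the Dataverse license name, and the function name are assumptions, and the sample entry is abbreviated.

```python
SPDX_SCHEME = "SPDX"
SPDX_SCHEME_URI = "https://spdx.org/licenses/"


def spdx_entry_to_license(entry, uri=None):
    """Map one SPDX license-list entry to a (hypothetical) Dataverse license dict.

    SPDX entries do not always carry the canonical license URI, so the caller
    can pass one explicitly; otherwise the first "seeAlso" link is used.
    """
    license_uri = uri or next(iter(entry.get("seeAlso", [])), None)
    if license_uri is None:
        raise ValueError("no license URI available for " + entry["licenseId"])
    return {
        "name": entry["licenseId"],          # assumption: reuse the SPDX id as the name
        "shortDescription": entry["name"],   # the SPDX 'name' is the longer text
        "uri": license_uri,
        "rightsIdentifier": entry["licenseId"],
        "rightsIdentifierScheme": SPDX_SCHEME,
        "schemeUri": SPDX_SCHEME_URI,
    }


# Abbreviated, illustrative entry in the shape used by the SPDX license list.
sample_entry = {
    "licenseId": "CC-BY-4.0",
    "name": "Creative Commons Attribution 4.0 International",
    "seeAlso": ["https://creativecommons.org/licenses/by/4.0/legalcode"],
}
converted = spdx_entry_to_license(
    sample_entry, uri="https://creativecommons.org/licenses/by/4.0/"
)
```

Taking the URI as an explicit argument keeps the existing, non-SPDX definition path separate, so licenses without an SPDX entry would still be defined with the current JSON syntax.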

What kind of user is the feature intended for?
API User, Curator, Depositor, Guest, Superuser, Sysadmin

What inspired the request?
Being able to deliver/expose license metadata to DataCite aligned with their recommendations.

What existing behavior do you want changed?
Some information is missing from the license metadata delivered/exposed to DataCite.

Any brand new behavior do you want to add to Dataverse?
Not brand new, but completing the issue would involve some changes in the database and scripts; see the "Completing this issue would involve" section above.

Any open or closed issues related to this feature request?
This issue replaces issue #8512.

Are you thinking about creating a pull request for this feature?
Help is always welcome. Is this feature something you or your organization plan to implement?
Yes, DataverseNO would like to sponsor and assist in any way we can to complete this issue.

Projects
Status: High priority
Status: Important
Status: ⚠️ Needed/Important