DCAT property for subsets ? #1527
Comments
If a classification system (A) is coherent and covering (non-overlapping, no gaps) for its scope, and an individual class (C1) in the system is updated in a way that changes the classification of other entities, then the update is breaking, and the classification system with the updated class (C1) MUST be identified as a new classification system (B). The issue is that an entity that is a member of class x under system A is not necessarily a member of class x under system B. Versioning of classification systems is very tricky!
Correct, @smrgeoinfo. The statistical agencies are generally all across this. However, DCAT is probably incomplete, since I don't think we had anyone with expertise in official statistics in the conversations.
I urge the DCAT editors to defer fine-grained versioning and change details to Dataset modelling, and not to try to cater for them at the DCAT metadata level. Consider: a large dataset like the Australian Address Database has addresses added and removed every few months, so should it have a long list of Distributions? No! The Dataset is the overall thing, and Address addition/removal/change is annotated at the Feature (sub-Dataset) level, since it is complex and knowledge about what an 'Address' is (a sub-Dataset element) is needed to use such information correctly. My motivation for stepping in here is that I would hate to see DCAT get too expressive: more expressive power in the vocabulary will harm adoption for simple catalogues, given the perception of it being "heavyweight", and broad adoption is more important to me than deep expressiveness. Anyway, there are already many Semantic Web ways to model versioning issues (e.g. PAV) that are DCAT-compatible. So use DCAT for the catalogue and drop down into fine-grained versioning in PAV, SDMX/QB etc. as needed.
I agree, but this is NOT what the original use case describes. The use case is that a class in a given classification system A is described with explanatory notes, and these explanatory notes change over time, but this does not lead to a reclassification of entities, so we are not creating a new classification system B. And the history of the notes is kept. So basically, the question is what would be the recommended practice between:
For more details, and regarding what XKOS suggests in terms of versioning of notes in statistical classifications, see http://linked-statistics.github.io/xkos/xkos-best-practices.html#bp-notes-versioning-timestamping
Just my opinion, but if the changes do not cause reclassification of entities (or introduction of new subcategories), then it would make sense to me to have one distribution with all the notes (assuming they are timestamped in some way).
Yes, the notes are timestamped (see the link sent previously for details). So the question is: would this be 2 distributions of the same dataset, or 2 datasets (which may not be practical for reusers)? And how should those distributions or datasets be identified, linked, or tagged?
Before further studying, I noticed that Can we close this issue?
For the moment we have simply deferred the answer, and we have actually referred to this very issue; here is how the XKOS best practices document now reads:
Exact pointer: http://linked-statistics.github.io/xkos/xkos-best-practices.html#bp-publishing-classification So no, I don't think the issue should be closed.
The content of a statistical classification evolves over time: the explanatory notes for items may change slightly and have successive versions. While it is important to point directly to the current version of all items in the classification, it is also relevant to be able to obtain the history of all items. It would therefore be useful to be able to distinguish two dcat:Distribution instances, one corresponding to the current notes of a classification and the other to the whole history.
dcat:hasCurrentVersion may not be exactly what we need. This property could separate different versions that correspond to the same master object. In our case, it is more of a content restriction to the latest versions of the notes.
A subproperty of dcat:distribution whose scope is a subset corresponding to the current contents of a dcat:Dataset would be relevant. Or would dcat:hasCurrentVersion still make sense anyway?
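To make the "content restriction" concrete, here is a minimal sketch in plain Python of how the "current" distribution could be derived from the full-history one. It is only an illustration: the `Note` record, its field names, the item codes, and the sample notes are all hypothetical and do not come from XKOS or DCAT.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Note:
    """One timestamped explanatory note (hypothetical structure)."""
    item: str     # code of the classification item the note explains
    issued: date  # date this version of the note was issued
    text: str


def current_notes(history):
    """Keep only the most recent note for each classification item."""
    latest = {}
    for note in history:
        kept = latest.get(note.item)
        if kept is None or note.issued > kept.issued:
            latest[note.item] = note
    return sorted(latest.values(), key=lambda n: n.item)


# Made-up sample data standing in for the "whole history" distribution.
history = [
    Note("A01", date(2019, 1, 1), "Growing of cereals."),
    Note("A01", date(2021, 6, 1), "Growing of cereals, excluding rice."),
    Note("A02", date(2019, 1, 1), "Forestry and logging."),
]

# The "current" distribution is a strict subset of the history.
current = current_notes(history)
```

In this reading, both distributions describe the same dataset and the same classification system; the "current" one simply contains fewer notes, which is why a subset-style relation seems closer to the need than a version relation.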
Deliverable(s): XKOS Best Practices
http://linked-statistics.github.io/xkos/xkos-best-practices.html#issue-container-number-12