Identifier and IdentifierSystem #137
Comments
@VladimirAlexiev, as a coauthor of the BusinessGraph ontology, can you please comment on whether the proposed reuse of it is adequate? |
Hi @Fak3 , thanks for referencing EuBusinessGraph!
|
@VladimirAlexiev at w3c-ccg/traceability-vocab#944 you proposed to use |
Not sure how to resolve this one. @Fak3 says As currently suggested in DigitalproductPassport.md, identifier is described with idScheme, idValue, idSchemeName:
The problem is that those properties are assigned not to the specific identifier, but to the entity, which can have multiple identifiers with different identification schemes. I don't think I agree with the assertion that these properties are not specific to the identifier. The id is the full URI of the identifier, which is globally unique. Let's imagine the same business entity has another identifier in another business register.
Totally different values for every property. There is nothing that demands that an entity use its national registered legal entity name when creating a GLN with GS1; it might choose something more like its trading name. Could you clarify the problem here? |
{
"id": "https://gln.gs1.org/1234567",
"name": "ACME Industries",
"idValue": "1234567",
"idScheme": "gln.gs1.org",
"idSchemeName": "Global Location Number"
} and {
"id": "https://abr.business.gov.au/ABN/View?abn=90664869327",
"name": "ACME Pty Ltd",
"idValue": "90664869327",
"idScheme": "abr.business.gov.au",
"idSchemeName": "Australian Business Number"
}
{
"id": "https://abr.business.gov.au/ABN/View?abn=90664869327",
"owl:sameAs": "https://gln.gs1.org/1234567"
}
{
"id": "https://gln.gs1.org/1234567",
"owl:sameAs": "https://abr.business.gov.au/ABN/View?abn=90664869327",
"name": ["ACME Industries", "ACME Pty Ltd"],
"idValue": ["1234567", "90664869327"],
"idScheme": ["gln.gs1.org", "abr.business.gov.au"],
"idSchemeName": ["Global Location Number", "Australian Business Number"]
} and {
"id": "https://abr.business.gov.au/ABN/View?abn=90664869327",
"owl:sameAs": "https://gln.gs1.org/1234567",
"name": ["ACME Industries", "ACME Pty Ltd"],
"idValue": ["1234567", "90664869327"],
"idScheme": ["gln.gs1.org", "abr.business.gov.au"],
"idSchemeName": ["Global Location Number", "Australian Business Number"]
} Thus, the original intent of assigning The issue here is that, with the current data model, the intended separation between the distinctly identified nodes can't be ensured. RDF states that properties describe the entity itself, while we currently assume that the properties describe the entity's identifier, violating that rule. So my proposal is to separate identifier metadata into its own node, which explicitly describes an identifier of the original entity. In that proposal the identifier becomes a distinct entity (graph node with type |
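The merge effect described in this comment can be reproduced with a small sketch. It is not a real OWL reasoner: `merge_same_as` is a hypothetical helper that collapses nodes linked by owl:sameAs into one equivalence class and pools their property values, which is roughly what the multi-valued JSON above shows. The sample values are taken from the snippets in this thread.

```python
from collections import defaultdict


def merge_same_as(nodes):
    """Naively pool the properties of nodes linked by owl:sameAs (illustrative only)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Build equivalence classes from the sameAs links.
    for node in nodes:
        find(node["id"])
        if "owl:sameAs" in node:
            parent[find(node["owl:sameAs"])] = find(node["id"])

    # Pool every property value under the representative id of its class.
    merged = defaultdict(lambda: defaultdict(list))
    for node in nodes:
        root = find(node["id"])
        for key, value in node.items():
            if key in ("id", "owl:sameAs"):
                continue
            if value not in merged[root][key]:
                merged[root][key].append(value)
    return {root: dict(props) for root, props in merged.items()}


gln = {
    "id": "https://gln.gs1.org/1234567",
    "name": "ACME Industries",
    "idValue": "1234567",
    "idScheme": "gln.gs1.org",
}
abn = {
    "id": "https://abr.business.gov.au/ABN/View?abn=90664869327",
    "name": "ACME Pty Ltd",
    "idValue": "90664869327",
    "idScheme": "abr.business.gov.au",
}
link = {"id": abn["id"], "owl:sameAs": gln["id"]}

merged_graph = merge_same_as([gln, abn, link])
```

After the merge there is a single node carrying two idValue entries and two idScheme entries, and nothing in the data says which value belongs to which scheme. That lost pairing is the inconsistency this comment is about.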
In addition, it makes sense to split out a separate Currently you have only 2 props:
but in the future you may have more, eg:
You can read about it in the euBusinessGraph Semantic Data Model https://docs.google.com/document/d/1dhMOTlIOC6dOK_jksJRX0CB-GIRoiYY6fWtCnZArUhU/edit#heading=h.hofh07qhoz6m |
OK I'll separate out the id scheme into its own class with its own "id" so that the graph will have a separate node for identity schemes. |
This issue still exists on current published spec: "issuer": {
"type": "CredentialIssuer",
"id": "did:web:identifiers.acme.com:12345",
"name": "ACME industries",
"otherIdentifiers": [{
"type": "Entity",
"id": "https://abr.business.gov.au/ABN/View?abn=90664869327",
"name": "ACME Pty Ltd",
"idValue": "90664869327",
"idScheme": "abr.business.gov.au",
"idSchemeName": "Australian Business Number"
}]
}, Should be: "issuer": {
"type": "CredentialIssuer",
"id": "did:web:identifiers.acme.com:12345",
"name": "ACME industries",
"identifier": [{
"type": "Identifier",
"notation": "https://abr.business.gov.au/ABN/View?abn=90664869327",
"name": "ACME Pty Ltd",
"isPartOf": {
"id": "abr.business.gov.au/ABN",
"type": "IdentifierSystem",
"name": "Australian Business Number"
}
}]
}, |
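As a rough illustration of the reshaping suggested in this comment, the sketch below converts the flat `otherIdentifiers` entries into nested `Identifier` / `IdentifierSystem` nodes. The helper names `to_identifier_node` and `migrate_issuer` are hypothetical. The sketch also assumes the old `idScheme` string can be reused verbatim as the system id, whereas the hand-written example above uses "abr.business.gov.au/ABN", so treat that mapping as a placeholder.

```python
def to_identifier_node(entity):
    """Turn one flat 'otherIdentifiers' entry into a nested Identifier node."""
    return {
        "type": "Identifier",
        "notation": entity["id"],
        "name": entity["name"],
        "isPartOf": {
            # Assumption: reuse the old idScheme string as the system id.
            "id": entity["idScheme"],
            "type": "IdentifierSystem",
            "name": entity["idSchemeName"],
        },
    }


def migrate_issuer(issuer):
    """Return a copy of the issuer with 'otherIdentifiers' replaced by 'identifier'."""
    migrated = {k: v for k, v in issuer.items() if k != "otherIdentifiers"}
    migrated["identifier"] = [
        to_identifier_node(entry) for entry in issuer.get("otherIdentifiers", [])
    ]
    return migrated


old_issuer = {
    "type": "CredentialIssuer",
    "id": "did:web:identifiers.acme.com:12345",
    "name": "ACME industries",
    "otherIdentifiers": [
        {
            "type": "Entity",
            "id": "https://abr.business.gov.au/ABN/View?abn=90664869327",
            "name": "ACME Pty Ltd",
            "idValue": "90664869327",
            "idScheme": "abr.business.gov.au",
            "idSchemeName": "Australian Business Number",
        }
    ],
}

new_issuer = migrate_issuer(old_issuer)
```

The point of the nesting is that the scheme metadata now hangs off the identifier node rather than off the issuer, so the issuer's own properties no longer include anything scheme-specific.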
I forgot to update the sample snippet on the page. But the model, schema and sample at the top of the page are different; please check https://uncefact.github.io/spec-untp/assets/files/untp-digital-product-passport-v0.3.6-86c6ee585e0905f8871b40838616f9ff.json |
This sample has the same issue: "issuer": {
"type": [
"CredentialIssuer"
],
"id": "did:web:identifiers.example-company.com:12345",
"name": "Example Company Pty Ltd",
"otherIdentifiers": [
{
"type": [
"Entity"
],
"id": "https://business.gov.au/ABN/View?abn=1234567890",
"name": "Sample Company Pty Ltd",
"registeredId": "1234567890",
"idScheme": {
"type": [
"IdentifierScheme"
],
"id": "https://business.gov.au/ABN/",
"name": "Australian Business Number"
}
}
]
}, Should be: "issuer": {
"type": "CredentialIssuer",
"id": "did:web:identifiers.example-company.com:12345",
"name": "Example Company Pty Ltd",
"identifier": [
{
"type": "Identifier",
"notation": "https://business.gov.au/ABN/View?abn=1234567890",
"name": "Sample Company Pty Ltd",
"isPartOf": {
"type": "IdentifierSystem",
"id": "https://business.gov.au/ABN-HTTP",
"name": "Australian Business Number URL"
}
},
{
"type": "Identifier",
"notation": "1234567890",
"name": "Sample Company Pty Ltd",
"isPartOf": {
"type": "IdentifierSystem",
"id": "https://business.gov.au/ABN",
"name": "Australian Business Number"
}
}
]
}, Also note that "https://business.gov.au/ABN-HTTP" and plain "https://business.gov.au/ABN" are different IdentifierSystems |
I think there might be a linked data architecture or strategy question behind this issue. It boils down to whether entities should be merged when some kind of equivalence is declared, or only when identifiers are exactly identical. If an entity with ID = abn.gov.au/123454567 declares "otherIdentifiers" like gs1.org/gln/9876543, does this mean they are the same and all data about abn.gov.au/123454567 and gs1.org/gln/9876543 should be merged?
In the example given, an ABN is an Australian national business tax registration number. A GLN is a GS1 identifier for a logistics location. In some cases where a business has only one operating location these two IDs could resolve to very similar things. But even so, a legal tax registration is not the same as a logistics location. Also, as soon as the business opens a second location and creates another GLN there will be far worse inconsistencies associated with any merge. I suggest adding some words to the graphs section of UNTP to emphasise that metadata of two entities should only be merged when the declared identifiers are identical (e.g. two instances of gs1.org/gln/9876543) but never when two different identifiers are declared to be related and possibly equivalent. |
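The rule suggested here (merge only on identical identifiers, never on declared relationships) could look something like the following sketch. `merge_by_exact_id` is a hypothetical helper, the `address` and `openingHours` properties are invented for illustration, and the first value seen wins when two nodes disagree on a property, which is a simplification.

```python
from collections import defaultdict


def merge_by_exact_id(nodes):
    """Merge nodes only when their 'id' strings are exactly equal."""
    merged = defaultdict(dict)
    for node in nodes:
        target = merged[node["id"]]
        for key, value in node.items():
            if key == "otherIdentifiers":
                # Related identifiers are kept as references, never merged.
                refs = target.setdefault("otherIdentifiers", [])
                for ref in value:
                    if ref not in refs:
                        refs.append(ref)
            else:
                # Simplification: keep the first value seen for each property.
                target.setdefault(key, value)
    return dict(merged)


nodes = [
    {"id": "abn.gov.au/123454567", "name": "ACME Pty Ltd",
     "otherIdentifiers": ["gs1.org/gln/9876543"]},
    {"id": "gs1.org/gln/9876543", "address": "1 Example St"},   # invented property
    {"id": "gs1.org/gln/9876543", "openingHours": "9-5"},       # invented property
]

graph = merge_by_exact_id(nodes)
```

The two gs1.org/gln/9876543 nodes collapse into one, while the ABN node stays separate and only records the GLN as a reference.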
Section "9.2 Equivalence" of gs1 digital link spec https://www.gs1.org/docs/Digital-Link/GS1_Digital_link_Standard_i1.1.pdf mentions use of owl:sameAs As well as the section "7.2 Decompression": @philarcher @mgh128 Can you please tell if you see a realistic scenario for a single product referenced by several different gs1 links? For ex. issuer1 uses compressed link, issuer2 uses uncompressed GTIN+batch. Could both links reference the same product in two different Product Passports or Conformity Credentials? If we forbid to process
|
Hi, in the EU context the Nordic business register authority cooperation advocates the use of adms:Identifier, but in the way it has been modeled in the EU Core Vocabularies: Here all the attributes are direct properties of the Identifier class... Then again, there's a reference to "the UN/CEFACT class with the same name", but in the UN/CEFACT CCL these attributes are actually included in the uDT (IdentifierType), and there they are indeed mostly properties of the "Identification Scheme". An example of a Finnish implementation of EU Core Business Vocabularies > "Identifier" |
The GS1 doc basically says "use owl:sameAs" with care: only when you are really sure the two different identifiers refer to the same thing. UNTP is not going to specify owl:sameAs in any scenario; that's a choice of the processor using vocabularies we don't control. UNTP can only recommend that "if two things have different identifiers then don't merge them". I really don't understand why this is even a contentious issue? |
We don't really leave a choice to the processors. If they receive and accept equivalence statements, they face an inconsistent graph, as described in the comments above. So to prevent inconsistent processing we must either fix the data model as proposed here, or document specific guidance on how to deal with the inconsistency. |
Ok fair enough. But where are we (ie UNTP) making any equivalence statement using owl:sameAs ? Is there an assumption that "otherIdentifiers" or "alsoKnownAs" should be interpreted as "owl:sameAs"? |
We do reference the GS1 Digital Link in our IdentityResolver.md. The GS1 Digital Link spec mandates the owl:sameAs relationship between short (compressed) and full links. |
Thank you! I believe it is important to align with EU Core business vocabulary and UN/CEFACT CCL. We should reuse same data model and have dedicated |
The uncefact ccl defines an identifier data type that mixes both the ID of the entity and the ID of the identifier scheme in one class. Which is what I thought was exactly what you are objecting to? |
No, as we discussed on Slack, separating identifierSystem metadata is not that important, as it does not lead to an inconsistent graph. |
But surely we are talking about two different questions here. Short and full links are just different technical representations of exactly the same registry entry. There's not even a merge question here because they both point to the exact same entry. But whether or not to merge two entries across two different registers is not the same question. |
"http://example.org/gtin/054123450013/lot/ABC%26%2B123?3103=000189&3923=2172" |
Those are not different schemes. They are both the same GTIN. Just different technical representations of the same thing. "Scheme" does not refer to a technical syntax. A different scheme means a different register (like ABN vs GLN). |
For the example of the same schema and different registries, there is a European Union open data endpoint: https://data.europa.eu/data/sparql?locale=en querying it with
i.e there are two registries: 1. data.brreg.no 2. register.geonorge.no Do we ever encounter such cases in untp? |
http://example.org/01/054123450013/10/ABC%26%2B123?3103=000189&3923=2172 is an example of a fully uncompressed GS1 Digital Link URI. (3103)000189(01)05412345000013(3923)2172(10)ABC&+123 is an example of a corresponding GS1 element string using parentheses around the GS1 Application Identifiers. It is not a GS1 Digital Link URI nor any kind of compressed format. It would be OK to express an owl:sameAs relationship between an uncompressed GS1 Digital Link URI and the exactly equivalent compressed or partially compressed GS1 Digital Link URIs but only if they identify the same thing, i.e. if the fully/partially compressed GS1 Digital Link URI encodes the same combination of GS1 Application Identifiers and their values. In practice, GS1 does not currently recommend the use of fully compressed or partially compressed GS1 Digital Link URIs within 2D barcodes for products. In most situations, a cautious use of upper-case alphanumeric characters and very few symbol characters enables efficient QR encoders to use the "alphanumeric" mode rather than "binary/byte" mode and this typically achieves an equivalent reduction in size of QR Code without the complexity of handling compression or decompression. I would expect that when GS1 Digital Link URIs are used within Linked Data or within Verifiable Credentials, it would be the fully uncompressed format, without any compression. I hope this helps. |
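The condition stated here, that two GS1 Digital Link URIs may be linked with owl:sameAs only if they encode the same combination of GS1 Application Identifiers and values, can be checked mechanically for uncompressed URIs. The sketch below is a minimal illustration under several assumptions: the path holds only key/value pairs with no extra prefix segments, compressed forms are not handled, and the `ALIASES` table for the alphabetic keys seen earlier in this thread (`gtin`, `lot`) is a hypothetical, incomplete mapping.

```python
from urllib.parse import parse_qsl, unquote, urlsplit

# Assumption: alphabetic path keys used earlier in the thread map to these AIs.
ALIASES = {"gtin": "01", "lot": "10"}


def application_identifiers(uri):
    """Return the set of (AI, value) pairs in an uncompressed Digital Link URI."""
    parts = urlsplit(uri)
    segments = [unquote(s) for s in parts.path.split("/") if s]
    if len(segments) % 2:
        raise ValueError("expected key/value pairs in the path")
    pairs = {
        (ALIASES.get(key, key), value)
        for key, value in zip(segments[0::2], segments[1::2])
    }
    # Data attributes such as 3103 and 3923 travel in the query string.
    pairs.update(parse_qsl(parts.query))
    return frozenset(pairs)


def same_thing(uri_a, uri_b):
    """True when both URIs encode the same set of AI/value pairs."""
    return application_identifiers(uri_a) == application_identifiers(uri_b)


numeric = "http://example.org/01/054123450013/10/ABC%26%2B123?3103=000189&3923=2172"
alpha = "http://example.org/gtin/054123450013/lot/ABC%26%2B123?3103=000189&3923=2172"
gtin_only = "http://example.org/01/054123450013"
```

With these inputs, `same_thing(numeric, alpha)` holds, while `same_thing(numeric, gtin_only)` does not, because the second URI identifies the product class rather than a specific batch.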
I think the discussion went astray.
|
I still do not understand. Every node in a graph needs an identifier. So if there is an Entity class it must have an id. Separately, that id may be issued under a governed scheme which itself has an id. So I understand if the requirement is "entities (with id) should be a separate class from identifierScheme (with its own id)". But I don't understand what it means to say "we need separate classes for entity and identifier". If the id of an entity is in a separate class then what is the id of the entity class?? |
@onthebreeze a graph node must have exactly one primary id. It can also have additional identifiers associated with it via properties. An instance of a Product class, as an abstract concept, can be independently described by multiple parties (issuers), and each party can independently choose a different primary id for the graph node that represents that same Product instance in its own separate graph. The verifier receives those separate graphs, plus some additional data suggesting that those parties indeed chose different primary ids for the same Product instance. Knowing that those ids refer to the same entity, the verifier can correctly apply business rules, treating the properties of those separate graph nodes as if they belong to the same entity. So one issuer chooses one identifier of an entity and promotes it to be the graph node's primary id. Then, as currently suggested, the issuer associates idScheme and all other product properties with this primary id. Note that all of these properties describe a particular physical product instance, except for idScheme, which describes the arbitrarily chosen primary id of a node in the graph. A verifier attempting to apply business logic must therefore be careful, because it faces data in which the physical properties of a product, which do not depend on the issuer's choice, are mixed with multiple idScheme properties, each of which is valid only for one issuer's choice of the node's primary id. |
I have just stumbled upon a recommendation to use |
Currently (and probably for the foreseeable future), GS1 only recommends
the use of uncompressed GS1 Digital Link URIs; neither fully compressed nor
partially compressed GS1 Digital Link URIs are supported by GS1 application
standards.
However, it would still be acceptable to use an owl:sameAs relationship
between an uncompressed GS1 Digital Link URI formed from a registered
domain name and the corresponding canonical GS1 Digital Link URI using
id.gs1.org as the hostname.
For Digital Product Passport data, I fully expect that factual claims will
be expressed at various granularities of identification - some facts may
apply to every instance of a product having that GTIN, while others may be
specific to a specific GTIN+Batch and others may be specific to one
individual product instance identified by GTIN+SerialNumber.
An AIDC data carrier such as a 2D barcode might encode GTIN, CPV,
Batch/Lot, SerialNumber but some data (such as EPCIS event data for supply
chain traceability / visibility) might only use GTIN+SerialNumber when
reporting that an individual object was observed at a location or
participated in a particular business process step. That's because
GTIN+SerialNumber provides the finest granularity of identification and
from those details, CPV or Batch/Lot should be accessible via lookup of the
master data for that GTIN+SerialNumber.
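The granularity point above can be sketched as a matching rule: a claim keyed only by GTIN applies to every instance, a claim keyed by GTIN+Batch applies to that batch, and a claim keyed by GTIN+SerialNumber applies to one item. The `ItemKey` type, the helper names and the sample claims are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ItemKey:
    gtin: str
    batch: Optional[str] = None
    serial: Optional[str] = None


def applies_to(claim_key, item):
    """A claim applies when every qualifier it specifies matches the item."""
    if claim_key.gtin != item.gtin:
        return False
    if claim_key.batch is not None and claim_key.batch != item.batch:
        return False
    if claim_key.serial is not None and claim_key.serial != item.serial:
        return False
    return True


def claims_for(item, claims):
    """Collect the claim texts that apply to one fully identified item."""
    return [text for key, text in claims if applies_to(key, item)]


# Invented claims at three granularities.
claims = [
    (ItemKey("05412345000013"), "class-level claim"),
    (ItemKey("05412345000013", batch="ABC"), "batch-level claim"),
    (ItemKey("05412345000013", serial="99"), "claim about another serial"),
]

item = ItemKey("05412345000013", batch="ABC", serial="42")
applicable = claims_for(item, claims)
```

This assumes the item's batch is already known from master data for its GTIN+SerialNumber, as the message above describes.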
…On Mon, Sep 2, 2024 at 11:21 AM Evstifeev Roman ***@***.***> wrote:
Section "9.2 Equivalence" of gs1 digital link spec
https://www.gs1.org/docs/Digital-Link/GS1_Digital_link_Standard_i1.1.pdf
mentions use of owl:sameAs
As well as the section "2.3 Decompression":
@philarcher <https://github.com/philarcher> @mgh128
<https://github.com/mgh128> Can you please tell if you see a realistic
scenario for a single product referenced by several different gs1 links?
For ex. issuer1 uses compressed link, issuer2 uses uncompressed GTIN+batch.
Could both links reference the same product in two different Product
Passports or Conformity Credentials?
If we forbid processing owl:sameAs on the verifier side, we must document
that clearly in the spec. This restriction will force verifiers to do
one or a combination of the following:
1. construct more complex SPARQL queries
2. strip off incoming owl:sameAs statements
3. apply custom graph merging rules (ignore or replace idScheme
property)
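Option 2 in the list above (stripping incoming owl:sameAs statements) might be approximated on compacted JSON-LD with a recursive filter like the one below. `strip_same_as` is a hypothetical helper; it only matches the literal key names it is given, so a document that uses another prefix or alias for the same property would pass through unchanged unless it is expanded first.

```python
OWL_SAME_AS_KEYS = ("owl:sameAs", "http://www.w3.org/2002/07/owl#sameAs")


def strip_same_as(value, keys=OWL_SAME_AS_KEYS):
    """Return a deep copy of a JSON-like value with the given keys removed."""
    if isinstance(value, dict):
        return {k: strip_same_as(v, keys) for k, v in value.items() if k not in keys}
    if isinstance(value, list):
        return [strip_same_as(v, keys) for v in value]
    return value


incoming = {
    "id": "https://abr.business.gov.au/ABN/View?abn=90664869327",
    "owl:sameAs": "https://gln.gs1.org/1234567",
    "otherIdentifiers": [
        {"id": "https://gln.gs1.org/1234567",
         "owl:sameAs": "https://abr.business.gov.au/ABN/View?abn=90664869327"}
    ],
}

cleaned = strip_same_as(incoming)
```

The input document is left untouched, so a verifier could still keep the original for audit while reasoning over the cleaned copy.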
|
I think the outcome of all this is
Closing this unless there are objections. |
Products and Organizations can have multiple identifiers expressed using various identifier schemes.
Current model issues
As currently suggested in DigitalproductPassport.md, identifier is described with idScheme, idValue, idSchemeName:
The problem is that those properties are assigned not to the specific identifier, but to the entity, which can have multiple identifiers with different identification schemes.
Let's imagine this JSON-LD data is stored and processed by an OWL inferencer, and that its graph database already records that this organization has two identifiers:
Processing this new data according to owl:sameAs semantics, the OWL inferencer will add new triples to the graph:
This does not make sense, as it means that the Decentralized Identifier conforms to the ABN identifier scheme.
Proposed model
To resolve the issue, identifiers should be described separately from the entity itself:
In the example above I omitted "type": "Identifier" but included an explicit "type": "IdentifierSystem". Whether we should require those types to be explicitly declared in documents is debatable.
I suggest reusing existing vocabularies:
adms:Identifier, based on the UN/CEFACT Identifier class.
Properties:
ebg:IdentifierSystem
Class ebg:IdentifierSystem from the BusinessGraph vocabulary.
Definition from ontology.ttl: "A system managed by a publisher (e.g., a register or agency) that is used to issue identifiers to entities (companies, persons, etc)."
Properties:
Property adms:identifier that links a resource to the adms:Identifier.
There are other properties in the BusinessGraph ontology which can be reused, e.g. jurisdiction, issuance and expiration date. We should probably add them to our JSON-LD context file as well.
Related to #135
Similar issue was discussed on traceability vocab: w3c-ccg/traceability-vocab#944