Un-star operation to support RDF Dataset Canonicalization? #114
Comments
This suggests that RDF-star needs to describe a transformation from RDF 1.2 full (using Triple Terms) to a version without Triple Terms, possibly using |
This post compares some of the many possible ways to map triple terms to RDF. One is similar to the RDF-reification-style unstar mapping defined in the RDF-star CG report; it can easily be adapted to many-to-many reifications. A second mapping is based solely on RDFS vocabulary. A mapping in the style of singleton properties is also possible, even in many-to-many situations. Last but not least, an n-ary mapping in the classic style of the W3C WG Note Defining N-ary Relations on the Semantic Web is presented. Instead of
Reification-style mapping
one-to-one:
one-to-many:
RDFS-style mapping
one-to-one:
one-to-many:
Singleton property-style mapping
one-to-one:
one-to-many:
N-ary-style mapping
one-to-one:
one-to-many:
Discussion
|
The RDFS-style mapping is also problematic since the |
I'd favor the Reification-style mapping, which follows previously proposed entailment. If there is a need to set it apart from classic reification, a subclass of |
This was discussed during the rdf-star meeting on 24 September 2024.
Transcript: Un-star operation to support RDF Dataset Canonicalization?
gkellogg: we talk about "un-star" since long time
<bengo> w3c/rdf-star-wg#114
<gb> Issue 114 Un-star operation to support RDF Dataset Canonicalization? (by niklasl) [needs discussion] [discuss-f2f]
pchampin: from the CWG: we defined RDF-Star semantics on top of the standard RDF semantics
<Zakim> gkellogg, you wanted to discuss conflation with reifiers and graph names
gkellogg: the issue is that we might create something that inserts triple in an existing named graph
pchampin: using reifiers as graph names would definitely create a number of issues. I would rather go for encoding each triple term into a blanknode made singleton named graph
pchampin: I try to keep the un-star mapping as liberal as possible.
<Zakim> AndyS, you wanted to ask about scope of the solution
AndyS: we might want to convert an RDF 1.1 graph with reification into a RDF 1.2 graph.
gtw: we should do that per triple-term. it's natural thing to look at what that looks like per reifier.
tl: Dydra already implements RDF-Star with named graphs. there is some experience
<Zakim> bengo, you wanted to ask if unstar to graph and unstar to dataset are both useful to standardize for different reasons
bengo: it would be useful to un-star to triples or graphs for different reasons.
pchampin: to respond to AndyS about staring standard reification: that is for me a totally different problem, it was not my intention in that proposal
gkellogg: regarding the notion to create named graphs per reifiers.
niklasl: it's important to un-star to RDF "classic" for a number of reasons
tl: we had an experiment with nested named graphs. the problem is that we have to extend SPARQL to query that. triple terms are much more powerful in that respect.
ACTION: pchampin to write a PR on rdf-concepts for the unstar mapping
<gb> Created action #129
ora: the question is how much effort do we want to put into edge cases that might not occur anyway
pchampin: I will write a pull-request with some examples
ora: this will go back into the backlog
pchampin: let's scan the backlog to prepare for Thursday as well
ora: good idea |
Based on the dataset proposal @pchampin brought up at TPAC, an RDF Full graph which uses Triple Terms might be decomposed into Named Graphs, either with a made up blank node graph name taking the place of the triple term, or the reifier serving as the graph name of a graph containing all triple terms related to that reifier. For example:
<Alice> :bought <LennyTheLion> {|
a :Purchase ;
:seller :ToyStore ;
:date "2024-06"
|} {|
a :Purchase ;
:seller :Market ;
:date "2024-12"
|} .
This might be turned into the following TriG without triple terms:
<Alice> :bought <LennyTheLion> .
GRAPH _:b0 {
<Alice> :bought <LennyTheLion> .
}
_:b0 a :Purchase ;
:seller :ToyStore ;
:date "2024-06" .
GRAPH _:b1 {
<Alice> :bought <LennyTheLion> .
}
_:b1 a :Purchase ;
:seller :Market ;
:date "2024-12" .
We might add a type to the generated blank nodes to aid in round-tripping, but it could be inferred from the use of the blank node naming the graph also being used as the subject of other triples. Note that if the reifier identified more than one triple term, all such triple terms would become triples in the related named graph. The alternative would create a blank node as the surrogate for the triple term rather than use the reifier. That might look like the following:
<Alice> :bought <LennyTheLion> .
GRAPH _:r0 {
<Alice> :bought <LennyTheLion> .
}
_:b0 a :Purchase ;
rdf:reifies _:r0;
:seller :ToyStore ;
:date "2024-06" .
GRAPH _:r1 {
<Alice> :bought <LennyTheLion> .
}
_:b1 a :Purchase ;
rdf:reifies _:r1;
:seller :Market ;
:date "2024-12" .
But, if there were multiple triple terms reified by the same reifier, they would each go into a separate graph. |
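The two decompositions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposed algorithm: terms are plain strings, a triple term is modelled as a 3-tuple, annotations are assumed to be already expanded into rdf:reifies triples, the function names and the generated `_:rN` labels are made up here, and nested triple terms are not handled. The surrogate variant below allocates one graph per distinct triple term, whereas the example above shows one graph per annotation.

```python
from itertools import count

REIFIES = "rdf:reifies"


def is_triple_term(term):
    # Assumption: a triple term is a 3-tuple, every other term is a string.
    return isinstance(term, tuple) and len(term) == 3


def unstar_reifier_as_graph(triples):
    """Variant 1: the reifier names a graph holding every triple term it reifies."""
    default, named = [], {}
    for s, p, o in triples:
        if p == REIFIES and is_triple_term(o):
            named.setdefault(s, []).append(o)
        else:
            default.append((s, p, o))
    return default, named


def unstar_surrogate(triples):
    """Variant 2: a fresh blank node stands in for each distinct triple term."""
    default, named, surrogate = [], {}, {}
    fresh = (f"_:r{i}" for i in count())  # hypothetical labels, may clash with input
    for s, p, o in triples:
        if p == REIFIES and is_triple_term(o):
            if o not in surrogate:
                surrogate[o] = next(fresh)
                named[surrogate[o]] = [o]  # singleton graph for the triple term
            default.append((s, p, surrogate[o]))
        else:
            default.append((s, p, o))
    return default, named
```

In the first variant a reifier with several triple terms yields one graph with several triples; in the second each triple term stays in its own singleton graph and the rdf:reifies link is kept in the default graph.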
@gkellogg I like the surrogate approach much more because it not only decouples the reifier name from the graph name (which, as you pointed out during the TPAC meeting, might clash with existing graph names when importing into existing datasets), but also allows describing the precise semantics of the connection between graph name and graph.
Why? Why could graph _:r1 in your last example not contain multiple statements? |
The idea was that the bnode stands in for the triple term; this is reinforced by the range of rdf:reifies being rdf:TripleTerm (as proposed), so, by these semantics, the blank node is a triple term, just made explicit by putting the triple derived from the triple term in a named graph. If that named graph contained multiple triples, it would no longer be a triple term. I favor the first interpretation, where the reifier is the graph name, as I think it makes it simpler to query all of the reified triples if they're in a single named graph, but this may complicate round-tripping. These are all issues to be discussed; I was just trying to illustrate my understanding of the discussion. |
@gkellogg I understand, and thanks for the illustrative summary! |
It would be helpful to decide the scope and requirements: Two I heard at the TPAC'24 meeting were:
This is not for or against the dataset approach; these are only significant design decisions to consider before detailed work on a particular route. There may be different transformations for different needs. The action: #129 |
Using the results of an unstarred graph or dataset has some possible requirements beyond or orthogonal to canonicalization. (Having it canonicalized may be an important prerequisite in some of them; e.g. checking integrity or logical diffs.) Given a reification-style mapping:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX : <http://example.org/ns#>
BASE <http://example.org/>
<Alice> :bought <LennyTheLion> .
[] a :Purchase ;
rdf:reifies _:b1 ;
:seller :ToyStore ;
:date "2024-06" .
[] a :Purchase ;
rdf:reifies _:b1 ;
:seller :Market ;
:date "2024-12" .
_:b1 a rdf:Triple ;
rdf:subject <Alice> ;
rdf:predicate :bought ;
rdf:object <LennyTheLion> .
Some advantages for putting that to use would be:
Some notes:
|
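A reification-style mapping like the one in the comment above could be produced mechanically. The following is a minimal Python sketch under stated assumptions: terms are strings, a triple term is a 3-tuple that only occurs in object position, and the function name and the generated `_:tN` labels are hypothetical. It reuses one blank node per distinct triple term, as in the example where both reifiers point to `_:b1`.

```python
from itertools import count


def unstar_reification(triples):
    """Replace each triple term by a blank node described with
    rdf:subject / rdf:predicate / rdf:object (typed rdf:Triple as above)."""
    out, node_for = [], {}
    fresh = (f"_:t{i}" for i in count())  # hypothetical labels, may clash with input

    def convert(term):
        if not (isinstance(term, tuple) and len(term) == 3):
            return term  # ordinary IRI, blank node or literal
        if term not in node_for:
            node = node_for[term] = next(fresh)
            s, p, o = term
            out.append((node, "rdf:type", "rdf:Triple"))
            out.append((node, "rdf:subject", s))
            out.append((node, "rdf:predicate", p))
            # The object may itself be a triple term, so recurse.
            out.append((node, "rdf:object", convert(o)))
        return node_for[term]

    for s, p, o in triples:
        out.append((s, p, convert(o)))
    return out
```

Because the output contains only ordinary triples, it could be handed to tooling that knows nothing about triple terms; whether the type should be rdf:Triple or something else is exactly what this thread is debating.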
To be clear, my proposal was the 2nd example of @gkellogg, namely the "surrogate" one (with the addition of a "special" named graph for identifying surrogate bnodes unambiguously). More precisely:
|
The object of a triple term can itself be a triple term which will need altering in S P O B. |
Should that not be "to the dataset D"? |
yes of course, thanks. I fixed it. |
@pchampin Does what I show below accurately illustrate your proposition?
RDF 1.1
|
If _:b1 is used with rdf:reifies, then the metadata block is arguably unnecessary, if the range of rdf:reifies is TripleTerm. |
What are the advantages and disadvantages of graphs vs datasets? Dataset - can query for the triple as a triple; needs the "admin" graph (or another way to distinguish these graphs). BTW rdf:ID in rdf/xml says the reification triples includes |
This was discussed during the rdf-star meeting on 26 September 2024.
Transcript: Define an interpretation of Triple Terms 5
niklasl: I propose to use the rdf:subject/predicate/object properties for the interpretation
niklasl: This might not be good to entail old reification from triple terms
<Zakim> pfps, you wanted to ask how this relates to the semantics from Enrico?
<niklasl> w3c/rdf-ucr#27
<gb> Issue 27 Integrating different ontology designs through entailment upon triple terms (by niklasl) [use case]
<pfps> Where are the semantics from Enrico, by the way?
ora: If we need some clarification on this, does it mean we defer this one until Enrico shows up?
pchampin: Does this point to a specific use case or is this a nice to have,
<niklasl> https://gist.github.com/niklasl/69428b043be6f1d33fd45f89cbe52632#file-statement-entailment-ttl
niklasl: there are use cases I defined previously
AndyS: This seems to relate to the discussion around unstar
AndyS: Done this way I don't see how we can add multiple triples to the same reifier
tl: I have developed in a github issue multiple variants and they are all many-to-many one is RDF-vocabulary based
<niklasl> Here is a comment on the unstarring issue I made yesterday w3c/rdf-star-wg#114 (comment) which relates to this issue about interpretation.
<gb> Issue 114 Un-star operation to support RDF Dataset Canonicalization? (by niklasl) [needs discussion] [discuss-f2f]
ora: I don't think we can reach a consensus here, is it a good discussion topic for next week after voting?
<Zakim> tl, you wanted to ask about when rdfs:states will be discussed
<gkellogg> JSON-LD-star slides – https://json-ld.github.io/w3c-tpac-2024-presentations/json-ld-star/
<AndyS> +1 to having the JSON-LD presentation in a focused meeting.
ora: Thank you everybody |
It really bothers me that a mapping to named graphs is discussed which would be constrained to singleton graphs. Singleton graphs do of course map more directly to triple terms. However, we have decided to base our approach on many-to-many reifiers and it seems like a real waste not to make use of the set nature of named graphs to implement them directly, instead of mimicking triple terms. I took up some aspects from various proposals above and mixed them a bit to see if they can be boiled down to a compact core. It turns out that triple terms and un-star mappings to RDF standard reification and RDF named graphs may even be mixed with each other, as the first set of examples illustrates, even if that may not be the intent of the proposed mappings (and I imagine that we might discuss strictly forbidding this).
# a mixed environment
:r1 rdf:reifies
<<( :s :p :o1 )>> ,
<<( :s :p :o2 )>> ,
[ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o3 ] ,
[ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o4 ] ,
:g1 .
:g1 a rdf:Graph .
:g1 {
:s :p :o5 ,
:o6
}
This is equivalent to the following serializations as triple terms, named graph and RDF 1.0 standard reification (and of course any other combinations thereof):
# RDF 1.2 triple terms
:r1 rdf:reifies
<<( :s :p :o1 )>> ,
<<( :s :p :o2 )>> ,
<<( :s :p :o3 )>> ,
<<( :s :p :o4 )>> ,
<<( :s :p :o5 )>> ,
<<( :s :p :o6 )>> .
# RDF 1.1 named graphs
:r1 rdf:reifies :g1 .
:g1 a rdf:Graph .
:g1 {
:s :p :o1 ,
:o2 ,
:o3 ,
:o4 ,
:o5 ,
:o6
}
# RDF 1.0 standard reification
:r1 rdf:reifies
[ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o1 ] ,
[ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o2 ] ,
[ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o3 ] ,
[ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o4 ] ,
[ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o5 ] ,
[ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o6 ] .
It should be pointed out that identifiers in the RDF 1.1 named graphs in the range of an
Applying this to the running example:
# statements
<Alice> :bought <LennyTheLion> .
<LennyTheLion> a :Puppet .
# annotations on statement reifications
_:b0 rdf:reifies _:r;
a :Purchase ;
:seller :ToyStore ;
:date "2024-06" .
_:b1 rdf:reifies _:r ;
a :Purchase ;
:seller :Market ;
:date "2024-12" .
# 3 alternative serializations of reifications
## RDF 1.2 triple terms
_:r owl:sameAs [
owl:intersectionOf
<<( <Alice> :bought <LennyTheLion> )>> ,
<<( <LennyTheLion> a :Puppet )>>
] .
## RDF 1.1 named graph
_:r owl:sameAs _:g .
_:g a rdf:Graph .
GRAPH _:g {
<Alice> :bought <LennyTheLion> .
<LennyTheLion> a :Puppet .
}
## RDF 1.0 standard reification
_:r owl:sameAs [
owl:intersectionOf
[ a rdf:Triple ;
rdf:subject <Alice> ;
rdf:predicate :bought ;
rdf:object <LennyTheLion> ] ,
[ a rdf:Triple ;
rdf:subject <LennyTheLion> ;
rdf:predicate rdf:type ;
rdf:object :Puppet ]
] .
Please note that the |
[@rat10]
It may be another discussion, but I think that ignoring it will lead to problems in this discussion.
That is really two triples, being —
If you've used
I don't think that's what you meant to say, and it only gets worse when you bring in the other |
@TallTed Thanks, you're right, and I tried to correct the problem. Whether it makes the examples not only correct (I hope so) but also clearer is another question. It was probably a bad idea from the start to use |
@domel no, your example above uses the reifier as the name of the graph (remember that The result of my proposed "unstar" mapping would be
edited to fix the mistake spotted by @gkellogg below |
From my understanding, you have the bnodes confused. As
:alice :says _:b1 .
_:b1 rdf:reifies _:b2.
_:b2 {
:bob :knows :charlie .
}
rdf:unstarMetadata {
_:b2 rdf:type rdf:TripleTerm .
} |
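The "surrogate plus metadata graph" shape shown above can be sketched in Python as follows. This is a hedged illustration, not @pchampin's actual definition: terms are strings, a triple term is modelled as a 3-tuple in object position, the dataset is a plain dict keyed by graph name (None for the default graph), and the function name and the generated `_:ttN` labels are assumptions. The names rdf:unstarMetadata and rdf:TripleTerm are taken from the example above. Since the object of a triple term can itself be a triple term, the conversion recurses.

```python
from itertools import count

METADATA = "rdf:unstarMetadata"  # graph name used in the example above


def unstar_dataset(triples):
    """Return {graph_name: [triples]} with one singleton graph per triple term."""
    dataset = {None: [], METADATA: []}
    surrogate = {}
    fresh = (f"_:tt{i}" for i in count())  # hypothetical labels, may clash with input

    def convert(term):
        if not (isinstance(term, tuple) and len(term) == 3):
            return term  # ordinary IRI, blank node or literal
        if term not in surrogate:
            name = surrogate[term] = next(fresh)
            s, p, o = term
            # The triple term becomes the only triple of the graph named `name`.
            dataset[name] = [(s, p, convert(o))]
            # Mark the blank node as a surrogate so the mapping can be reversed.
            dataset[METADATA].append((name, "rdf:type", "rdf:TripleTerm"))
        return surrogate[term]

    for s, p, o in triples:
        dataset[None].append((s, p, convert(o)))
    return dataset
```

Keeping the surrogate markers in a dedicated graph is what lets a reader of the dataset tell generated singleton graphs apart from named graphs that were already present.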
RDF dataset canonicalization appears to me to be outside the scope of the working group so I'm not sure why the working group should be worried about it. |
@pfps In the CG you proposed an unstar mapping, worked on it through several iterations and claimed it to be useful to describe the relation of triple terms to RDF standard reification (a claim that I still agree with, and that I find even more valid since the introduction of "reifiers"). What has changed? Or have I misunderstood you all the time? |
@rat10 Nothing has changed. The unstar mapping was useful with respect to the CG. RDF dataset canonicalization was not discussed in the CG, at least as far as I can remember. |
@pfps I think an unstar mapping has many useful applications, dataset canonicalization only being one of them. But to me dataset canonicalization is as good as any other application to discuss the unstar mapping. IMO we should do that, and I see no harm in using dataset canonicalization as the example use case. |
This issue is about using unstar to support RDF dataset canonicalization. If there are other reasons for defining an unstar operation they should be discussed in a different issue. |
The RDF Dataset Canonicalization specification (currently a Candidate Rec) is based on the abstract syntax of RDF 1.1.
What consequences do the additions in RDF 1.2 (prominently triple terms and language direction) have on this specification?
(This has been asked for in an email to the RDF-star WG from Phil Archer on 2023-11-06 and during the W3C Breakout Days RDF-star session on 2024-03-12.)
It is reasonable that the (necessary) backwards-compatible "unstarring" of triple terms, along with a fallback representation of literals with both language and direction, may sufficiently deal with the difference for most practical purposes (at least for "well-formed" graphs).
At some point, to ensure interoperability going forward, the canonicalization algorithm reasonably has to be updated to deal with the RDF 1.2 abstract syntax.