Un-star operation to support RDF Dataset Canonicalization? #114

Open
niklasl opened this issue Mar 21, 2024 · 29 comments

@niklasl

niklasl commented Mar 21, 2024

The RDF Dataset Canonicalization specification (currently a Candidate Rec) is based on the abstract syntax of RDF 1.1.

What consequences do the additions in RDF 1.2 (prominently triple terms and language direction) have on this specification?

(This has been asked for in an email to the RDF-star WG from Phil Archer at 2023-11-06 and during the W3C Breakout Days RDF-star session 2024-03-12.)

It is reasonable that the (necessary) backwards-compatible "unstarring" of triple terms, along with a fallback representation of literals with both language and direction, may sufficiently deal with the difference for most practical purposes (at least for "well-formed" graphs).

At some point, to ensure interoperability going forward, the canonicalization algorithm reasonably has to be updated to deal with the RDF 1.2 abstract syntax.

@gkellogg
Member

gkellogg commented Apr 4, 2024

This suggests that RDF-star needs to describe a transformation from RDF 1.2 full (using Triple Terms) to a version without Triple Terms, possibly using rdf:Statement. Ideally, this would allow such graphs/datasets to round-trip back to use Triple Terms. In this case, RDF Dataset Canonicalization may just need to describe that these steps are taken before canonicalization.

@rat10
Contributor

rat10 commented Aug 5, 2024

This post compares some of the many possible ways to map triple terms to RDF. One is similar to the RDF-reification-style unstar mapping defined in the RDF-star CG report. That can easily be adapted to many-to-many reifications. A second mapping is based solely on RDFS vocabulary. A mapping in the style of singleton properties is also possible, even in many-to-many situations. And last but not least, an n-ary mapping in the classic style of the W3C WG Note Defining N-ary Relations on the Semantic Web is presented.

Instead of rdf:reifies a more neutral ex:identifies is used because some mappings do rather cater to instantiation (and actual stating) than reification (and mere describing).

Reification-style mapping

one-to-one:

    _:r ex:identifies <<( :s :p :o )>> ; 
        :a :b .

    <=>

    _:r ex:identifies [
            rdf:subject :s ;
            rdf:predicate :p ;
            rdf:object :o 
          ] ;
        :a :b .

one-to-many:

    _:r ex:identifies <<( :s :p :o )>> ,
                      <<( :x :y :z )>> ; 
        :a :b .

    <=>

    _:r ex:identifies [
            rdf:subject :s ;
            rdf:predicate :p ;
            rdf:object :o 
          ] ,  [
            rdf:subject :x ;
            rdf:predicate :y ;
            rdf:object :z 
          ] ;
        :a :b .
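
For illustration, the reification-style mapping can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not a proposed algorithm: terms are plain strings, a triple term is a nested 3-tuple in object position, and the `_:tt` label scheme and the function name `unstar_reification` are hypothetical. It mints a fresh blank node for every occurrence, so the same triple term used twice yields two descriptions.

```python
from itertools import count


def unstar_reification(triples):
    """Replace each triple term (a nested 3-tuple in object position)
    with a fresh blank node described by rdf:subject/predicate/object."""
    fresh = count()
    out = []

    def encode(term):
        if not isinstance(term, tuple):
            return term                      # IRI, literal or blank node
        s, p, o = term
        node = f"_:tt{next(fresh)}"          # hypothetical label scheme
        out.append((node, "rdf:subject", s))
        out.append((node, "rdf:predicate", p))
        out.append((node, "rdf:object", encode(o)))  # the object may nest
        return node

    for s, p, o in triples:
        out.append((s, p, encode(o)))
    return out


graph = [
    ("_:r", "ex:identifies", (":s", ":p", ":o")),
    ("_:r", "ex:identifies", (":x", ":y", ":z")),
    ("_:r", ":a", ":b"),
]
for triple in unstar_reification(graph):
    print(*triple, ".")
```

The sketch assumes that no blank node labelled `_:tt…` already exists in the input; a real implementation would need to guarantee freshness.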

RDFS-style mapping

one-to-one:

    _:r ex:identifies <<( :s :p :o )>> ; 
        :a :b .

    <=>

    _:r ex:identifies [
            rdfs:subPropertyOf :p ;
            rdfs:domain :s ;
            rdfs:range :o 
          ] ;
        :a :b .

one-to-many:

    _:r ex:identifies <<( :s :p :o )>> ,
                      <<( :x :y :z )>> ; 
        :a :b .

    <=>

    _:r ex:identifies [
            rdfs:subPropertyOf :p ;
            rdfs:domain :s ;
            rdfs:range :o 
          ] , [
            rdfs:subPropertyOf :y ;
            rdfs:domain :x ;
            rdfs:range :z 
          ] ;
        :a :b .

Singleton property-style mapping

one-to-one:

    _:r ex:identifies <<( :s :p :o )>> ; 
        :a :b .

    <=>

    _:r ex:identifies :r1 ;
        :a :b .
    :s :r1 :o .
    :r1 rdf:value :p .

one-to-many:

    _:r ex:identifies <<( :s :p :o )>> ,
                      <<( :x :y :z )>> ; 
        :a :b .

    <=>

    _:r ex:identifies :r1 ,
                      :r2 ;
        :a :b .
    :s :r1 :o .
    :r1 rdf:value :p .
    :x :r2 :z .
    :r2 rdf:value :y .
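
A comparable Python sketch of the singleton-property-style mapping, under the same illustrative assumptions (terms as strings, triple terms as nested 3-tuples in object position). The function name `unstar_singleton` and the `:r1`, `:r2`, ... naming scheme are hypothetical. Nested triple terms are not handled here. Note that the output contains `:s :r1 :o` as an ordinary triple, which is why this style leans towards stating rather than merely describing.

```python
from itertools import count


def unstar_singleton(triples, base=":r"):
    """Map each triple term to a freshly minted singleton property.

    Predicates cannot be blank nodes, so an IRI has to be minted
    (here: base + running number, a purely illustrative scheme)."""
    fresh = count(1)
    out = []
    for s, p, o in triples:
        if isinstance(o, tuple):
            ts, tp, to = o
            prop = f"{base}{next(fresh)}"
            out.append((s, p, prop))               # _:r ex:identifies :r1
            out.append((ts, prop, to))             # :s :r1 :o
            out.append((prop, "rdf:value", tp))    # :r1 rdf:value :p
        else:
            out.append((s, p, o))
    return out


for triple in unstar_singleton([
    ("_:r", "ex:identifies", (":s", ":p", ":o")),
    ("_:r", "ex:identifies", (":x", ":y", ":z")),
    ("_:r", ":a", ":b"),
]):
    print(*triple, ".")
```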

N-ary-style mapping

one-to-one:

    _:r ex:identifies <<( :s :p :o )>> ; 
        :a :b .

    <=>

    _:r ex:identifies _:r1 ;
        :a :b .
    :s  :p  _:r1 .
    _:r1 rdf:value :o .

one-to-many:

    _:r ex:identifies <<( :s :p :o )>> ,
                      <<( :x :y :z )>> ; 
        :a :b .

    <=>

    _:r ex:identifies _:r1 ,
                      _:r2 ;
        :a :b .
    :s :p _:r1 .
    _:r1 rdf:value :o .
    :x :y _:r2  .
    _:r2 rdf:value :z .

Discussion

  • all approaches are able to map many-to-many relations.
  • all approaches seem to be equally well suited to round-tripping (but I haven't tested that)
  • all approaches make it possible to annotate individual reified terms in one-to-many reifications. It would not be possible to map such annotations back to RDF-star
  • the RDFS-style mapping is a bit unorthodox, which may or may not be considered an advantage, depending on preference. At least this makes it easier for systems to disambiguate old-school RDF reifications from mapped RDF-star triple terms.
  • the singleton property approach is harder to optimize for querying. It may also not be that easy to understand, although that might be a question of habituation. Because predicates can't be blank nodes in RDF, the otherwise hidden identifiers of individual triple terms have to be exposed.
  • the n-ary mapping is the closest to standard RDF
  • the reification based mapping seems best suited to represent rdf:reifies, whereas the other mappings seem to have a slight (RDFS-based) or even pretty strong (n-ary) leaning towards rdfs:stated terms.
  • the examples re-use existing RDF vocabulary wherever possible, but it may be prudent to mint un-star specific properties for the sake of disambiguation and solid round-tripping.
  • all approaches become more involved because of the need to model not only one/many-to-one but also one/many-to-many reifications. Especially the n-ary mapping suffers from this, as the following example illustrates, which is strictly one/many-to-ONE:
    _:r ex:identifies <<( :s :p :o )>> ; 
        :a :b .

    <=>

    :s  :p  [ rdf:value :o ;
              :a :b ] .
  • N-ary relations that branch out from the subject position might be considered appealing because of their straightforward structure, but suffer from the fact that they can't disambiguate statement from annotation:
    _:r ex:identifies <<( :s :p :o )>> ; 
        :a :b .

    <=>

    _:r rdf:value :s ;
        :p :o ;
        :a :b .

@niklasl
Author

niklasl commented Aug 8, 2024

The RDFS-style mapping is also problematic since the rdfs:range of both rdfs:domain and rdfs:range is rdfs:Class, so it would be inferred that every subject and object of any "unstarred" triple term would be of rdf:type rdfs:Class.
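
The inference described here can be reproduced with a small Python sketch of the RDFS range rule (rdfs3: if `p rdfs:range C` and `x p y`, then `y rdf:type C`). Only the two relevant RDFS axiomatic triples are included, terms are plain strings, and the function name `rdfs_range_entailments` is hypothetical; it is a single-pass illustration, not a reasoner.

```python
# The two RDFS axiomatic triples that matter for this example.
AXIOMS = {
    ("rdfs:domain", "rdfs:range", "rdfs:Class"),
    ("rdfs:range", "rdfs:range", "rdfs:Class"),
}


def rdfs_range_entailments(triples):
    """Apply rule rdfs3 once: every object of a property that has an
    rdfs:range C is inferred to be of rdf:type C."""
    graph = set(triples) | AXIOMS
    ranges = {(s, o) for s, p, o in graph if p == "rdfs:range"}
    return {
        (y, "rdf:type", cls)
        for _, p, y in graph
        for prop, cls in ranges
        if p == prop
    }


# The RDFS-style encoding of <<( :s :p :o )>>
unstarred = [
    ("_:t", "rdfs:subPropertyOf", ":p"),
    ("_:t", "rdfs:domain", ":s"),
    ("_:t", "rdfs:range", ":o"),
]
for triple in sorted(rdfs_range_entailments(unstarred)):
    print(*triple, ".")
```

Both `:s` and `:o` end up typed as `rdfs:Class`, whatever they actually denote.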

@niklasl
Author

niklasl commented Aug 8, 2024

I'd favor the Reification-style mapping, which follows previously proposed entailment. If there is a need to set it apart from classic reification, a subclass of rdf:Statement could be defined (e.g. rdf:Relationship). Triple term resources could have a uniqueness constraint (which could be formally defined using owl:hasKey (rdf:subject rdf:predicate rdf:object), and technically by minting an identifier from its constituents).
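
As a rough illustration of "minting an identifier from its constituents", the following Python sketch hashes the three terms into a deterministic IRI. Everything specific here is an assumption made for the example: the `urn:example:tt:` prefix, the use of SHA-256, the newline separator, and the expectation that terms are passed as N-Triples strings that contain no raw newline.

```python
import hashlib


def triple_term_id(s, p, o, prefix="urn:example:tt:"):
    """Derive a stable identifier from subject, predicate and object.

    The same three constituents always yield the same identifier, which
    gives the uniqueness that owl:hasKey would express formally."""
    key = "\n".join((s, p, o)).encode("utf-8")
    return prefix + hashlib.sha256(key).hexdigest()


alice_bought = triple_term_id(
    "<http://example.org/Alice>",
    "<http://example.org/ns#bought>",
    "<http://example.org/LennyTheLion>",
)
print(alice_bought)
```

Blank nodes inside a triple term would need separate treatment (e.g. skolemization), since their labels are not stable across documents.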

@gkellogg gkellogg changed the title How will RDF 1.2 affect RDF Canonicalization? Un-star operation to support RDF Dataset Canonicalization? Sep 19, 2024
@ktk ktk added the discuss-f2f Proposed for discussion during the next face-to-face meeting label Sep 19, 2024
@pchampin
Contributor

This was discussed during the rdf-star meeting on 24 September 2024.

View the transcript

Un-star operation to support RDF Dataset Canonicalization?

gkellogg: we have talked about "un-star" for a long time
… we want to transform the representation in some form of other representation
… I assumed this will be standard reification.
… is the mechanism by which we transform triple terms simply reification triples?
… should we use different types and properties for reification triples?

<bengo> w3c/rdf-star-wg#114

<gb> Issue 114 Un-star operation to support RDF Dataset Canonicalization? (by niklasl) [needs discussion] [discuss-f2f]

pchampin: from the CWG: we defined RDF-Star semantics on top of the standard RDF semantics
… we are using the same term "un-star" for a totally different purpose now
… many people asked why do you not just use singleton named graphs.
… I intended to write something and share it in advance but didn't manage to.
… do we want to have the "un-star" mapping to be lossless?
… I have a simpler version but it's not 100% lossless

<Zakim> gkellogg, you wanted to discuss conflation with reifiers and graph names

gkellogg: the issue is that we might create something that inserts triple in an existing named graph

pchampin: using reifiers as graph names would definitely create a number of issues. I would rather go for encoding each triple term into a blanknode made singleton named graph
… we encode the triple term into a singleton named graph that is a blank node
… we also add another graph that says "this blank node is a triple term"
… and any other blank node that is a triple term.

pchampin: I try to keep the un-star mapping as liberal as possible.
… if there is no triple term in an existing dataset this should work. but if you have already an un-star set in it, it becomes an edge-case
… with that we could convert every RDF-Star 1.2 into RDF 1.1 "classic"

<Zakim> AndyS, you wanted to ask about scope of the solution

AndyS: we might want to convert an RDF 1.1 graph with reification into a RDF 1.2 graph.
… what pchampin talked about, it got complicated once you said you want to put a dataset into a dataset that already contains a graph that has reification
… we might simplify that by saying it's two datasets and it becomes a merge operation

gtw: we should do that per triple-term. it's natural thing to look at what that looks like per reifier.

tl: Dydra already implements RDF-Star with named graphs. there is some experience
… they are happy to share the experience.
… The mapping to standard triples with the RDF reification vocab would be useful too and I would like to have it lossless

<Zakim> bengo, you wanted to ask if unstar to graph and unstar to dataset are both useful to standardize for different reasons

bengo: it would be useful to un-star to triples or graphs for different reasons.

pchampin: to respond to AndyS about starring standard reification: that is for me a totally different problem, it was not my intention in that proposal
… I had two goals: Canonicalization & flattening

gkellogg: regarding the notion to create named graphs per reifiers.
… querying would become much more difficult.

niklasl: it's important to un-star to RDF "classic" for a number of reasons
… for example to be able to add it to an existing graph store as soon as possible
… the problem is union graphs that many stores do.
… I believe using classic reification properties is frugal.

tl: we had an experiment with nested named graphs. the problem is that we have to extend SPARQL to query that. triple terms are much more powerful in that respect.
… it wouldn't be that easy with just named graphs. and also other reasons. things get tricky on SPARQL level

ACTION: pchampin to write a PR on rdf-concepts for the unstar mapping

<gb> Created action #129

ora: the question is how much effort do we want to put into edge cases that might not occur anyway

pchampin: I will write a pull-request with some examples

ora: this will go back into the backlog

pchampin: let's scan the backlog to prepare for Thursday as well

ora: good idea


@gkellogg
Member

Based on the dataset proposal @pchampin brought up at TPAC, an RDF Full graph which uses Triple Terms might be decomposed into Named Graphs, either with a made up blank node graph name taking the place of the triple term, or the reifier serving as the graph name of a graph containing all triple terms related to that reifier.

For example:

<Alice> :bought <LennyTheLion> {|
    a :Purchase ;
    :seller :ToyStore ;
    :date "2024-06"
  |} {|
    a :Purchase ;
    :seller :Market ;
    :date "2024-12"
  |} .

Might be turned into the following TriG without triple terms:

<Alice> :bought <LennyTheLion> .

GRAPH _:b0 {
    <Alice> :bought <LennyTheLion> .
}

_:b0 a :Purchase ;
    :seller :ToyStore ;
    :date "2024-06" .

GRAPH _:b1 {
    <Alice> :bought <LennyTheLion> .
}

_:b1 a :Purchase ;
    :seller :Market ;
    :date "2024-12" .

We might add a type to the generated blank nodes to aid in round-tripping, but it could be inferred from the use of the blank node naming the graph also being used as the subject of other triples.

Note that if the reifier identified more than one triple term, all such triple terms would become triples in the related named graph.

The alternative would create a blank node as the surrogate for the triple term rather than use the reifier. That might look like the following:

<Alice> :bought <LennyTheLion> .

GRAPH _:r0 {
    <Alice> :bought <LennyTheLion> .
}

_:b0 a :Purchase ;
    rdf:reifies _:r0;
    :seller :ToyStore ;
    :date "2024-06" .

GRAPH _:r1 {
    <Alice> :bought <LennyTheLion> .
}

_:b1 a :Purchase ;
    rdf:reifies _:r1;
    :seller :Market ;
    :date "2024-12" .

But, if there were multiple triple terms reified by the same reifier, they would each go into a separate graph.
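
The two variants can be contrasted with a short Python sketch. It is only an illustration under assumptions: triples are tuples of strings, triple terms are nested 3-tuples appearing as the object of `rdf:reifies`, quads use `None` for the default graph, and the function name `unstar_to_graphs` and the `_:r0`, `_:r1`, ... labels are hypothetical.

```python
from itertools import count


def unstar_to_graphs(triples, surrogate=False):
    """Turn rdf:reifies triples with triple-term objects into named graphs.

    surrogate=False: the reifier names the graph, so all triple terms of
                     one reifier land in the same named graph.
    surrogate=True:  each triple term gets its own fresh blank-node graph
                     name, and rdf:reifies points at that name."""
    fresh = count()
    quads = []
    for s, p, o in triples:
        if p == "rdf:reifies" and isinstance(o, tuple):
            if surrogate:
                name = f"_:r{next(fresh)}"
                quads.append((s, "rdf:reifies", name, None))
            else:
                name = s
            quads.append((*o, name))
        else:
            quads.append((s, p, o, None))
    return quads


source = [
    ("<Alice>", ":bought", "<LennyTheLion>"),
    ("_:b0", "rdf:reifies", ("<Alice>", ":bought", "<LennyTheLion>")),
    ("_:b0", ":seller", ":ToyStore"),
]
print(unstar_to_graphs(source))
print(unstar_to_graphs(source, surrogate=True))
```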

@rat10
Contributor

rat10 commented Sep 24, 2024

@gkellogg I like the surrogate approach much more because it does not only allow to decouple reifier name and graph name (which, as you pointed out during the TPAC meeting, might clash with existing graph names when importing into existing datasets), but also allows to describe the precise semantics of the connection between graph name and graph.

But, if there were multiple triple terms reified by the same reifier, they would each go into a separate graph.

Why? Why could graph _:r1 in your last example not contain multiple statements?

@gkellogg
Member

@gkellogg I like the surrogate approach much more because it does not only allow to decouple reifier name and graph name (which, as you pointed out during the TPAC meeting, might clash with existing graph names when importing into existing datasets), but also allows to describe the precise semantics of the connection between graph name and graph.

But, if there were multiple triple terms reified by the same reifier, they would each go into a separate graph.

Why? Why could graph _:r1 in your last example not contain multiple statements?

The idea was that the bnode stands in for the triple term; this is reinforced by the range of rdf:reifies being rdf:TripleTerm (as proposed) so by this semantic, the blank node is a triple term, just made explicit by putting the triple derived from the triple term in a named graph. If that named graph contained multiple triples, it would no longer be a triple term.

I favor the first interpretation, where the reifier is the graph name, as I think it makes it simpler to query all of the reified triples if they're in a single named graph, but this may complicate round-tripping. These are all issues to be discussed; I was just trying to illustrate my understanding of the discussion.

@rat10
Contributor

rat10 commented Sep 24, 2024

@gkellogg I understand, and thanks for the illustrative summary!
I just think that restricting reifier graphs to singletons is a step backwards from what we have right now with reifiers being many-to-many, and actually quite surprising given that named graphs are, well, graphs.
I agree that the first interpretation is simpler, but I fear it's too underspecified. It is really just plain RDF 1.1 named graphs, without any further fixings. Why would we have had all those discussions about semantics (and named graphs' lack thereof) and then suddenly all that wouldn't matter anymore and named graphs are good enough? Well, certainly stuff for interesting discussions (and I very much welcome the initiative!).

@afs
Contributor

afs commented Sep 25, 2024

It would be helpful to decide the scope and requirements:

Two I heard at the TPAC'24 meeting were:

  1. The dataset proposal would preclude working with an un-star graph as a graph
  2. Whether the output is a standalone dataset/graph (for canonicalization) or to be combined with other data.

This is not for or against the dataset approach; these are simply significant design decisions to consider before detailed work on a particular route.

There may be different transformations for different needs.


The action: #129

@niklasl
Author

niklasl commented Sep 25, 2024

Using the results of an unstarred graph or dataset has some possible requirements beyond or orthogonal to canonicalization. (Having it canonicalized may be an important prerequisite in some of them; e.g. checking integrity or logical diffs.)

Given a reification-style mapping:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX : <http://example.org/ns#>
BASE <http://example.org/>

<Alice> :bought <LennyTheLion> .

[] a :Purchase ;
    rdf:reifies _:b1 ;
    :seller :ToyStore ;
    :date "2024-06" .

[] a :Purchase ;
    rdf:reifies _:b1 ;
    :seller :Market ;
    :date "2024-12" .

_:b1 a rdf:Triple ;
    rdf:subject <Alice> ;
    rdf:predicate :bought ;
    rdf:object <LennyTheLion> .

Some advantages for putting that to use would be:

  • It is confined within one graph (one interpretation), and will work with existing triple and quad stores as is.
  • Triple term occurrences in graphs will not be asserted in a "default union graph".
    • This is a practical concern given how existing graph stores and many SPARQL engines work. Few implementations allow a default union graph alongside named graphs that are not part of that union in a convenient way (such as a union of millions of named graphs (not feasible using FROM), excluding all singleton graphs reified within each). That showed that just making the RDF-star syntax de-sugar to named graphs could not work well for querying (and would be tricky for storing and updating).
  • Both the above and the "surrogate" approach share one characteristic: rdf:reifies is present in the results, and can be subject to e.g. OWL restrictions on cardinality. (Relevant for interop with RDF 1.1-based systems.)
  • The above form should be isomorphic to the interpretation of triples, in order to facilitate upgrades transparently (querying and reasoning could work across RDF 1.1 and 1.2 systems, with some limitations (e.g. if rdf:reifies would be rejected by some old implementation since it wasn't defined in 1.1)).
  • I believe this would fit within the proposed RDF 1.2 basic profile.

Some notes:

  • The unstarred form will not be the same as classic reification. Those may be "upgraded" to reifiers of statement tokens in a backwards-compatible way, e.g. using OWL, or a simpler SPARQL construct. This is up to the dataset maintainer to decide. But I think it would be clear how old and new-style reification differ.
  • It may be roundtrippable. Detecting ill-formed triple terms (with other properties than its three constituents) may be advisable. (A principal problem in any kind of unstarring, and a core reason for adding triple terms. Skolemized bnodes and an opaque literal representation of the triple term is a way out, but it is not otherwise (very) usable, per the advantage points above.)
  • Using a skolemized form for all bnodes, and then something like urn:tdb:data IRIs (a resource addressed by the data) for the triple terms themselves (e.g. <urn:tdb:2014:data:application/n-triples,%3Chttp%3A//example.org/Alice%3E%20%3Chttp%3A//example.org/ns%23bought%3E%20%3Chttp%3A//example.org/LennyTheLion%3E> for the above triple), may have some benefits, e.g. ease the checking of well-formedness, and avoiding redundancy of duplicate blank nodes for the same triple term. (But I'm not sure if skolemization rhymes with C14N.)
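
The urn:tdb:data IRI quoted in the last note can be reconstructed with the Python standard library. This is a sketch that assumes all three terms are IRIs (literals and skolemized blank nodes would need their own serialization), and the helper name `tdb_data_iri` is hypothetical. The percent-encoding simply follows what the example IRI shows: everything except unreserved characters and `/` is escaped.

```python
from urllib.parse import quote


def tdb_data_iri(s, p, o, date="2014"):
    """Build a urn:tdb 'data' IRI whose payload is the triple written as
    N-Triples terms (without the terminating dot), percent-encoded."""
    ntriple = f"<{s}> <{p}> <{o}>"
    return f"urn:tdb:{date}:data:application/n-triples," + quote(ntriple, safe="/")


print(tdb_data_iri(
    "http://example.org/Alice",
    "http://example.org/ns#bought",
    "http://example.org/LennyTheLion",
))
```

Because the identifier is a pure function of the triple, two occurrences of the same triple term map to the same IRI, which is the redundancy-avoiding property mentioned above.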

@pchampin
Contributor

pchampin commented Sep 26, 2024

To be clear, my proposal was the 2nd example of @gkellogg, namely the "surrogate" one (with the addition of a "special" named graph for identifying surrogate bnodes unambiguously).

More precisely:

  • let D be the dataset to unstar
  • if D contains a named graph named rdf:unstarMetadata and triple terms, then raise an error
  • for each triple term <<( S P O )>> occurring in D
    • mint a fresh blank node B
    • add in D a named graph named B containing a single triple S P O.
      (in other words, add the quad S P O B to the dataset D)
    • add the triple B rdf:type rdf:TripleTerm in the named graph named rdf:unstarMetadata of D,
      creating it if it does not exist
      (in other words, add the quad B rdf:type rdf:TripleTerm rdf:unstarMetadata in D)
    • replace all occurrences of <<( S P O )>> in D with B
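
The steps above can be sketched in Python as follows. This is a minimal illustration under assumptions, not a reference implementation: the dataset is a collection of quads `(s, p, o, g)` with `None` for the default graph, triple terms are nested 3-tuples in object position, and the `_:tt` labels are assumed not to clash with existing blank nodes. Encoding nested triple terms innermost-first is an addition of this sketch; the steps as listed do not spell that out.

```python
from itertools import count

META = "rdf:unstarMetadata"


def unstar_dataset(quads):
    """Replace every distinct triple term by a fresh blank node B, add the
    quad (S, P, O, B) and record B as rdf:TripleTerm in the metadata graph."""
    quads = list(quads)
    has_terms = any(isinstance(o, tuple) for _, _, o, _ in quads)
    if has_terms and any(g == META for *_, g in quads):
        raise ValueError("dataset already has an rdf:unstarMetadata graph")
    fresh, surrogate, out = count(), {}, set()

    def encode(term):
        if not isinstance(term, tuple):
            return term
        s, p, o = term
        flat = (s, p, encode(o))            # inner triple terms first
        if flat not in surrogate:
            b = surrogate[flat] = f"_:tt{next(fresh)}"
            out.add((*flat, b))
            out.add((b, "rdf:type", "rdf:TripleTerm", META))
        return surrogate[flat]

    for s, p, o, g in quads:
        out.add((s, p, encode(o), g))
    return out


dataset = [
    (":alice", ":says", "_:b1", None),
    ("_:b1", "rdf:reifies", (":bob", ":knows", ":charlie"), None),
]
for quad in sorted(unstar_dataset(dataset), key=str):
    print(quad)
```

Keeping one surrogate per distinct triple term (the `surrogate` dictionary) mirrors the "replace all occurrences" step: two uses of the same triple term share one blank node.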

@afs
Contributor

afs commented Sep 26, 2024

The object of a triple term can itself be a triple term which will need altering in S P O B.

@TallTed
Member

TallTed commented Sep 26, 2024

@pchampin

(in other words, add the quad S P O B to the dataset B)

Should that not be "to the dataset D"?

@pchampin
Contributor

Should that not be "to the dataset D"?

yes of course, thanks. I fixed it.

@domel

domel commented Sep 26, 2024

@pchampin Does what I show below accurately illustrate your proposition?
RDF 1.2

:alice :says << :bob :knows :charlie >> .

RDF 1.1

:alice :says _:b1 .

_:b1 {
    :bob :knows :charlie .
}

rdf:unstarMetadata {
    _:b1 rdf:type rdf:TripleTerm .
}

@gkellogg
Member

If _:b1 is used with rdf:reifies, then the metadata block is arguably unnecessary, if the range of rdf:reifies is TripleTerm.

@afs
Contributor

afs commented Sep 26, 2024

What are the advantages and disadvantages of graph vs datasets?

Dataset - can query for the triple as a triple ; needs the "admin" graph (or another way to distinguish these graphs).
Graph - works with graph-only context - RDF/XML; a graph in an existing dataset. Can't query directly for the triple; need to "see" the reification.

BTW rdf:ID in RDF/XML says the reification triples include rdf:type rdf:Statement, so if the rdf:type is rdf:Triple there is no confusion, but one also can't simply interpret an existing RDF/XML graph as RDF-star.

@pchampin
Contributor

This was discussed during the rdf-star meeting on 26 September 2024.

View the transcript

Define an interpretation of Triple Terms 5

niklasl: I propose to use the rdf:subject/predicate/object properties for the interpretation

niklasl: It might not be good to entail old reification from triple terms

<Zakim> pfps, you wanted to ask how this relates to the semantics from Enrico?

<niklasl> w3c/rdf-ucr#27

<gb> Issue 27 Integrating different ontology designs through entailment upon triple terms (by niklasl) [use case]

<pfps> Where are the semantics from Enrico, by the way?

ora: If we need some clarification on this, does it mean we defer this one until Enrico shows up?

pchampin: Does this point to a specific use case or is this a nice to have,

<niklasl> https://gist.github.com/niklasl/69428b043be6f1d33fd45f89cbe52632#file-statement-entailment-ttl

niklasl: there are use cases I defined previously

AndyS: This seems to relate to the discussion around unstar

AndyS: Done this way I don't see how we can add multiple triples to the same reifier

tl: I have developed in a github issue multiple variants and they are all many-to-many

one is RDF-vocabulary based

<niklasl> Here is a comment on the unstarring issue I made yesterday w3c/rdf-star-wg#114 (comment) which relates to this issue about interpretation.

<gb> Issue 114 Un-star operation to support RDF Dataset Canonicalization? (by niklasl) [needs discussion] [discuss-f2f]

ora: I don't think we can reach a consensus here, is it a good discussion topic for next week after voting?

<Zakim> tl, you wanted to ask about when rdfs:states will be discussed

<gkellogg> JSON-LD-star slides – https://json-ld.github.io/w3c-tpac-2024-presentations/json-ld-star/

<AndyS> +1 to having the JSON-LD presentation in a focused meeting.

ora: Thank you everybody


@rat10
Contributor

rat10 commented Oct 2, 2024

It really bothers me that a mapping to named graphs is discussed which would be constrained to singleton graphs. Singleton graphs do of course map more directly to triple terms. However, we have decided to base our approach on many-to-many reifiers and it seems like a real waste to not make use of the set-nature of named graphs to implement them directly, instead of mimicking triple terms.

I took up some aspects from various proposals above and mixed them a bit to see if they can be boiled down to a compact core. It turns out that triple terms and un-star mappings to RDF standard reification and RDF named graphs may even be mixed with each other, as the first set of examples illustrates, even if that may not be the intent of the proposed mappings (and I imagine that we might discuss to strictly forbid this).

# a mixed environment
:r1 rdf:reifies
    <<( :s :p :o1 )>> ,
    <<( :s :p :o2 )>> ,
    [ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o3 ] ,
    [ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o4 ] ,
    :g1 .
:g1 a rdf:Graph .
:g1 {
    :s :p :o5 , 
          :o6
}

This is equivalent to the following serializations as triple terms, named graph and RDF 1.0 standard reification (and of course any other combinations thereof):

# RDF 1.2 triple terms
:r1 rdf:reifies
    <<( :s :p :o1 )>> ,
    <<( :s :p :o2 )>> ,
    <<( :s :p :o3 )>> ,
    <<( :s :p :o4 )>> ,
    <<( :s :p :o5 )>> ,
    <<( :s :p :o6 )>> .
# RDF 1.1 named graphs 
:r1 rdf:reifies :g1 .
:g1 a rdf:Graph .
:g1 {
    :s :p :o1 ,
          :o2 ,
          :o3 ,
          :o4 , 
          :o5 ,
          :o6
}
# RDF 1.0 standard reification
:r1 rdf:reifies
    [ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o1 ] ,
    [ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o2 ] ,
    [ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o3 ] ,
    [ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o4 ] ,
    [ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o5 ] ,
    [ a rdf:Triple; rdf:subject :s ; rdf:predicate :p ; rdf:object :o6 ] .

It should be pointed out that identifiers in the rdfs:range of an rdf:reifies statement that are not of type rdf:Triple or rdf:Graph (or a triple term) don't add to the meaning of a reifier.

RDF 1.1 named graphs in the range of an rdf:reifies statement need to have a well specified semantics, e.g. that the name refers to the graph, that the graph so referenced doesn't contribute to the truth of the dataset, etc. Note that this semantics only applies to the reference in the reifying statements, with no effect on the semantics of eventual other references to the graph so named. The Nested Named Graph (NNG) proposal provides a sketch of a vocabulary to define such a semantics (among others).

Applying this to the running example:

# statements
<Alice> :bought <LennyTheLion> .
<LennyTheLion> a :Puppet .

# annotations on statement reifications
_:b0 rdf:reifies _:r;
    a :Purchase ;
    :seller :ToyStore ;
    :date "2024-06" .

_:b1 rdf:reifies _:r ;
    a :Purchase ;
    :seller :Market ;
    :date "2024-12" .

# 3 alternative serializations of reifications

## RDF 1.2 triple terms
_:r owl:sameAs  [
    owl:intersectionOf
        <<( <Alice> :bought <LennyTheLion> )>> ,
        <<( <LennyTheLion> a :Puppet )>> 
   ] .

## RDF 1.1 named graph
_:r owl:sameAs _:g .
_:g a rdf:Graph .
GRAPH _:g {
    <Alice> :bought <LennyTheLion> .
    <LennyTheLion> a :Puppet .
}

## RDF 1.0 standard reification
_:r owl:sameAs [
    owl:intersectionOf
        [ a rdf:Triple ;
          rdf:subject <Alice> ;
          rdf:predicate  :bought ; 
          rdf:object <LennyTheLion> ] ,
        [ a rdf:Triple ;
          rdf:subject <LennyTheLion> ;
          rdf:predicate  rdf:type ; 
          rdf:object :Puppet ] 
    ] .

Please note that the owl:sameAs construct only serves to present the three alternative serializations. I'm unsure if this use of owl:sameAs is somehow questionable, but that's another discussion.

@TallTed
Member

TallTed commented Oct 2, 2024

[@rat10]

Please note that the owl:sameAs construct only serves to present the three alternative serializations. I'm unsure if this use of owl:sameAs is somehow questionable, but that's another discussion.

It may be another discussion, but I think that ignoring it will lead to problems in this discussion.

owl:sameAs basically says that "the thing in the subject position" can be swapped with "the thing in the object position". You can test whether you've used owl:sameAs correctly by performing that exercise -- e.g., you've said —

_:r owl:sameAs <<( <Alice> :bought <LennyTheLion> )>> ,
               <<( <LennyTheLion> a :Puppet )>> .

That is really two triples, being —

_:r owl:sameAs <<( <Alice> :bought <LennyTheLion> )>> .
_:r owl:sameAs <<( <LennyTheLion> a :Puppet )>> .

If you've used owl:sameAs correctly, those can be rewritten as —

<<( <LennyTheLion> a :Puppet )>> owl:sameAs <<( <Alice> :bought <LennyTheLion> )>> .

I don't think that's what you meant to say, and it only gets worse when you bring in the other owl:sameAs triples of your examples, so I don't think you've used owl:sameAs correctly, and I strongly suggest that you re-write your examples to leave this predicate out, as it can only confuse the discussion of the rest of your comment, which appears to me to have some merit.
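
The substitution test can be mechanized with a small Python sketch that computes the symmetric and transitive closure of owl:sameAs over a handful of triples. It is an illustration only: terms are strings, triple terms are 3-tuples, and the function name `same_as_pairs` is hypothetical.

```python
def same_as_pairs(triples):
    """Return every ordered pair of distinct terms that owl:sameAs
    forces to be equal (symmetric and transitive closure)."""
    classes = []                        # disjoint sets of co-referring terms
    for s, p, o in triples:
        if p != "owl:sameAs":
            continue
        merged, rest = {s, o}, []
        for c in classes:
            if c & merged:
                merged |= c             # shares a term: same equivalence class
            else:
                rest.append(c)
        classes = rest + [merged]
    return {(a, b) for c in classes for a in c for b in c if a != b}


t1 = ("<Alice>", ":bought", "<LennyTheLion>")
t2 = ("<LennyTheLion>", "rdf:type", ":Puppet")
pairs = same_as_pairs([("_:r", "owl:sameAs", t1), ("_:r", "owl:sameAs", t2)])
print((t2, t1) in pairs)   # the unintended equality between the two triple terms
```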

@rat10
Contributor

rat10 commented Oct 2, 2024

@TallTed Thanks, you're right, and I tried to correct the problem. Whether it makes the examples not only correct (I hope so) but also clearer is another question. It was probably a bad idea from the start to use owl:sameAs.

@pchampin
Contributor

pchampin commented Oct 4, 2024

@pchampin Does what I show below accurately illustrate your proposition? RDF 1.2

:alice :says << :bob :knows :charlie >> .

RDF 1.1

:alice :says _:b1 .

_:b1 {
    :bob :knows :charlie .
}

rdf:unstarMetadata {
    _:b1 rdf:type rdf:TripleTerm .
}

@domel no, your example above uses the reifier as the name of the graph (remember that :alice :says << :bob :knows :charlie >> is short for :alice :says _:b1 . _:b1 rdf:reifies <<( :bob :knows :charlie )>> .).

The result of my proposed "unstar" mapping would be

:alice :says _:b1 .
_:b1 rdf:reifies _:b2.

_:b2 {
    :bob :knows :charlie .
}

rdf:unstarMetadata {
    _:b2 rdf:type rdf:TripleTerm .
}

edited to fix the mistake spotted by @gkellogg below
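
To illustrate how the metadata graph supports round-tripping, here is a Python sketch of the reverse direction, turning the result above back into triple terms. It rests on assumptions: quads are `(s, p, o, g)` tuples with `None` for the default graph, triple terms are rebuilt as nested 3-tuples, the whole rdf:unstarMetadata graph is dropped on the way back, and the surrogates do not reference each other cyclically. The function name `restar_dataset` is hypothetical.

```python
META = "rdf:unstarMetadata"


def restar_dataset(quads):
    """Rebuild triple terms from surrogate blank nodes that the metadata
    graph marks as rdf:TripleTerm."""
    quads = set(quads)
    marked = {
        s for s, p, o, g in quads
        if g == META and p == "rdf:type" and o == "rdf:TripleTerm"
    }
    content = {}
    for s, p, o, g in quads:
        if g in marked:
            if g in content:
                raise ValueError(f"{g} names a graph with more than one triple")
            content[g] = (s, p, o)

    def decode(term):
        if term in content:
            s, p, o = content[term]
            return (s, p, decode(o))        # surrogates may nest
        return term

    return {
        (s, p, decode(o), g)
        for s, p, o, g in quads
        if g != META and g not in marked
    }


unstarred = [
    (":alice", ":says", "_:b1", None),
    ("_:b1", "rdf:reifies", "_:b2", None),
    (":bob", ":knows", ":charlie", "_:b2"),
    ("_:b2", "rdf:type", "rdf:TripleTerm", META),
]
for quad in sorted(restar_dataset(unstarred), key=str):
    print(quad)
```

The singleton check is where a many-triple graph would be rejected, which is the constraint debated earlier in this thread.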

@gkellogg
Member

gkellogg commented Oct 4, 2024

From my understanding, you have the bnodes confused. As :alice :says _:b1, and _:b1 rdf:reifies _:b2, and it is _:b2 which is the rdf:TripleTerm and the graph name. If I have that wrong, please describe the reasoning.

:alice :says _:b1 .
_:b1 rdf:reifies _:b2.

_:b2 {
    :bob :knows :charlie .
}

rdf:unstarMetadata {
    _:b2 rdf:type rdf:TripleTerm .
}

@gkellogg gkellogg removed the discuss-f2f Proposed for discussion during the next face-to-face meeting label Oct 24, 2024
@pfps
Contributor

pfps commented Nov 13, 2024

RDF dataset canonicalization appears to me to be outside the scope of the working group so I'm not sure why the working group should be worried about it.

@rat10
Contributor

rat10 commented Nov 13, 2024

@pfps In the CG you proposed an unstar mapping, worked on it through several iterations and claimed it to be useful to describe the relation of triple terms to RDF standard reification (a claim that I still agree with, and that I find even more valid since the introduction of "reifiers"). What has changed? Or have I misunderstood you all the time?

@pfps
Contributor

pfps commented Nov 13, 2024

@rat10 Nothing has changed. The unstar mapping was useful with respect to the CG. RDF dataset canonicalization was not discussed in the CG, at least as far as I can remember.

@rat10
Contributor

rat10 commented Nov 13, 2024

@pfps I think an unstar mapping has many useful applications, dataset canonicalization only being one of them. But to me dataset canonicalization is as good as any other application to discuss the unstar mapping. IMO we should do that, and I see no harm in using dataset canonicalization as the example use case.

@pfps
Contributor

pfps commented Nov 13, 2024

This issue is about using unstar to support RDF dataset canonicalization. If there are other reasons for defining an unstar operation they should be discussed in a different issue.
