Remove base dot segments in RFC3986 tests. #525

Closed
wants to merge 1 commit into from

davidlehn
Member

Some of the RFC3986 tests don't remove dot segments from the base URI. The few libraries I tested do perform this processing. It's hard to tell, but it might be optional in the spec. It seems more understandable to me to always remove all dot segments from the full output path.

@gkellogg
Member

I'll need to go back and re-check, but this test is based on others made for Turtle, RDFa and other processors, originally submitted to rdf-tests by @RubenVerborgh. See issue w3c/rdf-tests#6 for the background, and the email in rdf-comments from August 2015. Basically, it indicates that URI normalization should not be applied in most cases, which is why the tests include the dot segments. The algorithm in 6.3 IRI Expansion is essentially the same as the one used in Turtle and other RDF parsers, so I believe the same reasoning should apply. We call out the fact that normalization should not occur:

Otherwise, if document relative is true, set value to the result of resolving value against the base IRI. Only the basic algorithm in section 5.2 of [RFC3986] is used; neither Syntax-Based Normalization nor Scheme-Based Normalization are performed.
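
For readers who don't have section 5.2 of RFC 3986 in front of them, here is a minimal Python sketch of its reference resolution steps (5.2.2 through 5.2.4, plus recomposition from 5.3). It is only an illustration under stated assumptions, not the code of any JSON-LD or RDF library: the function names (`split_uri`, `remove_dot_segments`, `merge`, `resolve`) are hypothetical, and the example URIs are made up. Note that in this sketch the base path is copied verbatim only when the reference has an empty path; a merged path still goes through dot-segment removal.

```python
import re

# Generic URI-splitting regex from RFC 3986, Appendix B.
_URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Return (scheme, authority, path, query, fragment); absent parts are None."""
    m = _URI_RE.match(uri)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def remove_dot_segments(path):
    """Section 5.2.4: interpret and remove "." and ".." segments."""
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]            # "/./x" becomes "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]            # "/../x" becomes "/x"
            if out:
                out.pop()              # drop the previous output segment
        elif path == "/..":
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            idx = path.find("/", 1)
            if idx == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:idx])
                path = path[idx:]
    return "".join(out)


def merge(base_authority, base_path, ref_path):
    """Section 5.2.3: merge a relative-path reference with the base path."""
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    return base_path[: base_path.rfind("/") + 1] + ref_path


def resolve(base, ref):
    """Section 5.2.2 (strict): transform a reference against a base URI."""
    b_scheme, b_auth, b_path, b_query, _ = split_uri(base)
    r_scheme, r_auth, r_path, r_query, r_frag = split_uri(ref)
    if r_scheme is not None:
        scheme, auth, query = r_scheme, r_auth, r_query
        path = remove_dot_segments(r_path)
    elif r_auth is not None:
        scheme, auth, query = b_scheme, r_auth, r_query
        path = remove_dot_segments(r_path)
    elif r_path == "":
        # Empty reference path: the base path is used as-is, dot segments included.
        scheme, auth, path = b_scheme, b_auth, b_path
        query = r_query if r_query is not None else b_query
    else:
        scheme, auth, query = b_scheme, b_auth, r_query
        if r_path.startswith("/"):
            path = remove_dot_segments(r_path)
        else:
            path = remove_dot_segments(merge(b_auth, b_path, r_path))
    # Section 5.3: recompose the components into a string.
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if auth is not None:
        result += "//" + auth
    result += path
    if query is not None:
        result += "?" + query
    if r_frag is not None:
        result += "#" + r_frag
    return result
```

With a hypothetical base of `http://example.org/a/./b/../c`, calling `resolve(base, "#frag")` keeps the base path untouched, while `resolve(base, "d")` merges the paths and then removes the dot segments.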

@davidlehn
Member Author

Well, isn't that all confusing. Seems it would have been easiest to just declare dot segments in base URIs as undefined behavior and not bother with tests for it.

Are there URI resolving libraries that use this behavior, or do all JSON-LD, RDF, and similar parsers just implement custom code for this? Does anyone have suggestions for code to look at so we can update the custom resolvers in the JS, Python, and PHP libs? I'm guessing the algorithm has to initialize the output with the unprocessed base path, then start the dot segment removal process with the relative path as input.

It's unfortunate those tests didn't exist from the beginning. It's entirely possible the Digital Bazaar implementations have produced improper output because of this. But who uses dot segments in base URIs anyway?

@gkellogg
Member

My code is fairly straightforward. The key, for me, was not to join against the base if it's already an absolute IRI.

The URI.join code is in RDF::URI#join.
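
As a rough illustration of that guard (in Python rather than Ruby, and not a port of `RDF::URI#join`), the sketch below returns a value unchanged when it already starts with a scheme, so any dot segments in an absolute IRI are left alone. The function name `expand_document_relative` and the example IRIs are hypothetical, and `urllib.parse.urljoin` is only a stand-in resolver whose handling of dot segments may differ from what the tests expect.

```python
import re
from urllib.parse import urljoin

# RFC 3986 scheme syntax: a letter, then letters, digits, "+", "-" or ".", then ":".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def expand_document_relative(value, base):
    """Resolve value against base unless value is already an absolute IRI."""
    if _SCHEME_RE.match(value):
        # Already absolute: return it untouched, with no joining or normalization.
        return value
    # Stand-in resolver (assumption); swap in the resolver you actually use.
    return urljoin(base, value)


# An absolute IRI keeps its dot segments because it never reaches the resolver.
print(expand_document_relative("http://example.org/a/../b", "http://example.com/x/"))
```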

Testing against the margins is what it takes to ensure compatibility, which is why they were added for the other formats.

@davidlehn
Member Author

I updated the jsonld.js code to handle the tests as-is. Will update the Python and PHP libs at some point. Closing.

@davidlehn davidlehn closed this Jul 25, 2017
@gkellogg gkellogg deleted the dot-segments branch June 1, 2022 17:42