Canonicalize and dedup URLs in to_rb #61
Open
JSON::LD::Context#parse only looks in the PRELOADED hash using a fully canonicalized URL, which includes replacing https with http. Any preload or alias registered under a non-canonicalized name is therefore never matched: it cannot be used and only wastes memory.
This commit fully canonicalizes both the base and alias URLs (including changing https to http) and removes any duplicates.
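To illustrate the idea, here is a minimal sketch using only Ruby's standard library. It is not the code from this commit: the helper names `canonical_url` and `canonicalize_and_dedup` are hypothetical, and the library's own canonicalization may differ in detail.

```ruby
require 'uri'

# Hypothetical helper (not part of json-ld): normalize a URL and force the
# http scheme, mirroring the kind of key the PRELOADED lookup expects.
def canonical_url(url)
  # URI#normalize lowercases the scheme and host and turns an empty path into "/".
  normalized = URI.parse(url.to_s).normalize.to_s
  normalized.sub(/\Ahttps:/, 'http:')
end

# Hypothetical helper: canonicalize the base URL and its aliases, then drop
# aliases that duplicate each other or collapse onto the base URL.
def canonicalize_and_dedup(base, aliases)
  canonical_base = canonical_url(base)
  canonical_aliases = aliases.map { |a| canonical_url(a) }.uniq - [canonical_base]
  [canonical_base, canonical_aliases]
end

base, aliases = canonicalize_and_dedup(
  'https://Schema.org/',
  ['http://schema.org/', 'https://schema.org', 'http://schema.org/docs/jsonldcontext.jsonld']
)
puts base
puts aliases.inspect
```

In this example the first two aliases collapse onto the canonical base URL and are dropped, which is the same effect that makes most aliases unnecessary once everything is canonicalized.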
I considered adding error checking or canonicalization in add_preloaded and alias_preloaded as well, but I'll propose that separately, as compatibility is tricky.
See also ruby-rdf/json-ld-preloaded#7, where this is applied: it ends up removing all aliases (they were unnecessary) and fixing a preload that previously was not working at all (attempting to use it would make the HTTP request).