Using the Linked Data module to access authorities
Table of contents
- Overview
- Prerequisites
- Accessing via QA
- Mount engine in routes
- List authorities
- Reload authorities
- Example search queries
- Example term fetch request
The linked data module provides access to many authorities using their linked data APIs. Common code is used to make requests through an external authority's linked data API and to process the linked data results that are returned. A configuration file describes the access URLs and how to interpret the data that is returned.
Authorities that fit best with this module follow some basic rules...
- search API URL allows a query string to be passed and returns results in some serialization of linked data
- While not required, for best processing, the search results should include a rank predicate that indicates the order of the returned results
- term fetch API URL allows an ID or URI to be passed and returns data about the term in some serialization of linked data
You will need to add the ruby-rdf/linkeddata gem that processes a large number of linked data serializations.
gem 'linkeddata'
NOTE: This gem is included in QA for development and testing of QA, but is not automatically included in the released gem.
To create a new authority, you will need to...
- identify search API URL that returns linked data
- identify term fetch API URL that returns linked data
- create configuration with the API URLs and how to interpret data results (See documentation on creating configurations.)
Place the new configuration in the config/authorities/linked_data directory.
There are existing configurations to many commonly used authorities. You can get the configuration file and find more documentation about their linked data APIs at...
- dbPedia (2.0 config) (setup instructions)
- GeoNames (2.0 config) (setup instructions)
- Library of Congress (LOC) (2.0 config) (setup instructions) (NOTE: term fetch access only)
- NALT (Agricultural Thesaurus) (2.0 config) (setup instructions)
- OCLC FAST (2.0 config) (setup instructions)
To use one, copy the configuration file and place it in the config/authorities/linked_data directory.
NOTE: The list of configurations created by the LD4 series of grants is available at ld4p/linked_data_authorities. Any config whose name ends in _DIRECT goes directly against an external authority and can be used with any application. The others can serve as a starting point if you are creating your own cache of the authority's data.
Authority: Use the name of the configuration file (e.g. oclc_fast.json has the authority name oclc_fast).
Subauthorities:
Subauthorities, if supported, are defined in the configuration file. They are defined separately for search and term fetch.
For oclc_fast, the config file defines the search subauthorities as...
"subauthorities": {
"topic": "oclc.topic",
"geographic": "oclc.geographic",
"event_name": "oclc.eventName",
"personal_name": "oclc.personalName",
"corporate_name": "oclc.corporateName",
"uniform_title": "oclc.uniformTitle",
"period": "oclc.period",
"form": "oclc.form",
"alt_lc": "oclc.altlc"
}
NOTE: The key in the subauthorities hash is the subauthority name used in QA requests. The value in the subauthorities hash is the value required by the external authority's API. For more information on configuring subauthorities, see the configuration documentation.
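The key-to-value relationship described in the note can be sketched in a few lines of Ruby. This is only an illustration: the `external_subauthority` helper is hypothetical and not part of QA, and the JSON is a trimmed excerpt of the oclc_fast subauthorities shown above.

```ruby
require 'json'

# Trimmed excerpt of the search subauthorities from the oclc_fast config above.
CONFIG_EXCERPT = <<~JSON
  {
    "subauthorities": {
      "topic": "oclc.topic",
      "personal_name": "oclc.personalName"
    }
  }
JSON

# Hypothetical helper (not part of QA): translate the subauthority name used
# in a QA request into the value the external authority's API expects.
def external_subauthority(config, qa_name)
  config.fetch('subauthorities').fetch(qa_name) do
    raise ArgumentError, "unsupported subauthority: #{qa_name}"
  end
end

config = JSON.parse(CONFIG_EXCERPT)
puts external_subauthority(config, 'personal_name') # prints oclc.personalName
```

Raising on an unknown name, rather than passing it through, is a design choice of this sketch; it keeps typos in the QA request from reaching the external API.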
The examples in this document include a starting path ENGINE_MOUNT. The value of this is typically qa or authorities. It is defined in /config/routes.rb.
mount Qa::Engine => '/authorities'
To list all currently loaded authorities:
/ENGINE_MOUNT/list/linked_data/authorities
If you add an authority to the directory holding authorities, it won't be picked up until the server is restarted. You can force a reload without restarting the server by requesting the reload URL with your reload token.
/ENGINE_MOUNT/reload/linked_data/authorities?auth_token=YOUR_AUTH_TOKEN
NOTE: YOUR_AUTH_TOKEN is defined in config/initializers/qa.rb as config.authorized_reload_token.
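As a rough sketch, a script could assemble the list and reload paths like this. The helper names `list_path` and `reload_path` are hypothetical, and the engine mount value is an assumption taken from the routes example above; only the path shapes come from this document.

```ruby
require 'uri'

# Hypothetical helpers (not part of QA) that build the paths shown above.
# engine_mount is whatever the engine is mounted at in config/routes.rb.
def list_path(engine_mount)
  "/#{engine_mount}/list/linked_data/authorities"
end

def reload_path(engine_mount, auth_token)
  # encode_www_form escapes any characters in the token that are unsafe in a URL
  query = URI.encode_www_form(auth_token: auth_token)
  "/#{engine_mount}/reload/linked_data/authorities?#{query}"
end

puts list_path('authorities')
puts reload_path('authorities', 'YOUR_AUTH_TOKEN')
```

Encoding the token instead of interpolating it directly matters if the token contains characters such as `&` or spaces.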
The linked data module supports many different external linked data authorities. The OCLC FAST authority, which is included in QA, is being used in these examples to demonstrate how to search for terms using the linked data authority module in QA.
All authorities support the following parameters for search:
- lang - if supported, return literals tagged with this language plus literals without language tags (optional). This can also be set through the 'HTTP_ACCEPT_LANGUAGE' header.
- context - if true and if supported by the authority, additional context will be returned for each response in the query results
- performance_data - if true, the response will include performance data along with the results
- response_header - if true, metadata about the request and response will be included in the response along with the results
- format - currently only supports json
Configurations generally support the following parameters using a consistent naming scheme even if the external authority uses a different name for these parameters. The actual name of the parameter is defined in the configuration. See qa_replacement_patterns in the example configuration.
- q - the string query
- maxRecords - limit the number of returned records (optional)
/ENGINE_MOUNT/search/linked_data/oclc_fast?q=twain&maximumRecords=2
Result:
[
{"uri":"http://id.worldcat.org/fast/1914919","id":"1914919","label":"Life on the Mississippi (Twain, Mark)"},
{"uri":"http://id.worldcat.org/fast/1796341","id":"1796341","label":"Works (Twain, Mark)"}
]
NOTE: The qa request is converted to the following OCLC FAST request. This is for information only. You do not need to know this to use QA.
http://experimental.worldcat.org/fast/search?query=cql.any+all+%22twain%22&sortKeys=usage&maximumRecords=2
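To make the conversion concrete, here is a minimal sketch that reproduces the OCLC FAST URL above from the QA request parameters. It is not QA's implementation, which is driven by the configuration file; the `fast_search_url` helper is hypothetical, the subauthority table is an excerpt of the config shown earlier, and the `cql.any` default and `sortKeys=usage` values are simply read off the example URL.

```ruby
require 'uri'

FAST_SEARCH_URL = 'http://experimental.worldcat.org/fast/search'

# Excerpt of the search subauthorities from the oclc_fast config.
SUBAUTHORITIES = {
  'topic' => 'oclc.topic',
  'personal_name' => 'oclc.personalName'
}.freeze

# Hypothetical helper: build the external request for a QA search.
# Without a subauthority, the example URL searches the cql.any index.
def fast_search_url(query, subauthority: nil, maximum_records: nil)
  index = subauthority ? SUBAUTHORITIES.fetch(subauthority) : 'cql.any'
  params = [['query', %(#{index} all "#{query}")], ['sortKeys', 'usage']]
  params << ['maximumRecords', maximum_records] if maximum_records
  "#{FAST_SEARCH_URL}?#{URI.encode_www_form(params)}"
end

puts fast_search_url('twain', maximum_records: 2)
# prints http://experimental.worldcat.org/fast/search?query=cql.any+all+%22twain%22&sortKeys=usage&maximumRecords=2
```

Passing `subauthority: 'personal_name'` swaps `cql.any` for `oclc.personalName`, which is the only difference between a plain search and a subauthority search in this sketch.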
/ENGINE_MOUNT/search/linked_data/oclc_fast/personal_name?q=twain&maximumRecords=2
Result:
[
{"uri":"http://id.worldcat.org/fast/1580187","id":"1580187","label":"Braden, Twain"},
{"uri":"http://id.worldcat.org/fast/365563","id":"365563","label":"Twain, Shania"}
]
NOTE: The qa request is converted to the following OCLC FAST request. This is for information only. You do not need to know this to use QA.
http://experimental.worldcat.org/fast/search?query=oclc.personalName+all+%22twain%22&sortKeys=usage&maximumRecords=2
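From the consuming application's side, the search requests and results above could be handled roughly as follows. Both helpers are hypothetical and the engine mount value is an assumption; the sample body is the personal_name result shown above.

```ruby
require 'json'
require 'uri'

# Hypothetical helper: build a QA search path, with an optional subauthority
# segment and any extra query parameters (e.g. maximumRecords, lang).
def search_path(engine_mount, authority, query, subauthority: nil, **params)
  segments = [engine_mount, 'search', 'linked_data', authority, subauthority].compact
  "/#{segments.join('/')}?#{URI.encode_www_form({ q: query }.merge(params))}"
end

# Hypothetical helper: index the labels in a QA search response by URI.
def labels_by_uri(response_body)
  JSON.parse(response_body).to_h { |result| [result['uri'], result['label']] }
end

SAMPLE_BODY = <<~JSON
  [
    {"uri":"http://id.worldcat.org/fast/1580187","id":"1580187","label":"Braden, Twain"},
    {"uri":"http://id.worldcat.org/fast/365563","id":"365563","label":"Twain, Shania"}
  ]
JSON

puts search_path('authorities', 'oclc_fast', 'twain',
                 subauthority: 'personal_name', maximumRecords: 2)
puts labels_by_uri(SAMPLE_BODY)['http://id.worldcat.org/fast/365563'] # prints Twain, Shania
```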
The linked data module supports many different external linked data authorities. The LOC authority, which is included in QA, is being used in these examples to demonstrate how to fetch a term using the linked data authority module in QA.
Some authorities only support fetching by ID, while others only support fetching by URI. QA provides two APIs to allow for fetch by ID (i.e. show/{id}) or by URI (i.e. fetch?uri={uri}).
The term fetch supports the following standard parameter for all requests:
- format = json | jsonld - An example of each follows. The notes under the example results explain how the results depend on the value of this parameter.
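The two request styles and the format parameter can be sketched as a pair of path builders. The helper names are hypothetical and the engine mount value is an assumption; only the show and fetch path shapes and the format parameter come from this document.

```ruby
require 'uri'

# Hypothetical helper: fetch by ID, e.g. show/linked_data/loc/subjects/sh85076841.
def show_path(engine_mount, authority, id, subauthority: nil, response_format: nil)
  segments = [engine_mount, 'show', 'linked_data', authority, subauthority, id].compact
  path = "/#{segments.join('/')}"
  response_format ? "#{path}?#{URI.encode_www_form(format: response_format)}" : path
end

# Hypothetical helper: fetch by URI. The term URI is percent-encoded so that
# it travels safely as a single query parameter value.
def fetch_path(engine_mount, authority, term_uri, response_format: nil)
  params = { uri: term_uri }
  params[:format] = response_format if response_format
  "/#{engine_mount}/fetch/linked_data/#{authority}?#{URI.encode_www_form(params)}"
end

puts show_path('authorities', 'loc', 'sh85076841',
               subauthority: 'subjects', response_format: 'jsonld')
puts fetch_path('authorities', 'dbpedia_direct',
                'http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art')
```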
/ENGINE_MOUNT/show/linked_data/loc/subjects/sh85076841
Result:
{
"uri":"http://id.loc.gov/authorities/subjects/sh85076841",
"id":"sh 85076841",
"label":["Life sciences"],
"altlabel":["Biosciences","Sciences, Life"],
"narrower":["http://id.loc.gov/authorities/subjects/sh85083022","http://id.loc.gov/authorities/subjects/sh85002415",etc.],
"broader":["http://id.loc.gov/authorities/subjects/sh00007934"],
"sameas":[""],
"predicates":{
"http://www.loc.gov/mads/rdf/v1#hasCloseExternalAuthority":["http://id.worldcat.org/fast/998323","http://data.bnf.fr/ark:/12148/cb119716335",etc.],
"http://www.loc.gov/mads/rdf/v1#isMemberOfMADSCollection":["http://id.loc.gov/authorities/subjects/collection_SubdivideGeographically","http://id.loc.gov/authorities/subjects/collection_LCSH_General",etc.],
"http://www.loc.gov/mads/rdf/v1#isMemberOfMADSScheme":["http://id.loc.gov/authorities/subjects"],
"http://www.w3.org/2008/05/skos-xl#altLabel":["Biosciences","Sciences, Life"],
etc.}
}
NOTE: The results when requesting json are normalized based on definitions in the configuration file for the LOC authority. Using format=json provides apps with a normalized set of results that are easier to process by the consuming app. The results will include all parts of the graph returned by the external authority that have the result URI as the subject URI of the triples. Extended data that have different subject URIs are not part of the results returned by QA in the json format.
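The subject-URI rule in this note can be illustrated with plain Ruby, without any RDF library. This is a sketch of the idea only, not QA's code: the `Triple` struct and `predicates_for` helper are hypothetical, and the sample triples are a small hand-picked subset of the data in the examples on this page.

```ruby
# Minimal stand-in for an RDF triple; QA itself works with real RDF graphs.
Triple = Struct.new(:subject, :predicate, :object)

TERM_URI = 'http://id.loc.gov/authorities/subjects/sh85076841'

GRAPH = [
  Triple.new(TERM_URI, 'http://www.loc.gov/mads/rdf/v1#isMemberOfMADSScheme',
             'http://id.loc.gov/authorities/subjects'),
  Triple.new(TERM_URI, 'http://www.w3.org/2008/05/skos-xl#altLabel', 'Biosciences'),
  Triple.new(TERM_URI, 'http://www.w3.org/2008/05/skos-xl#altLabel', 'Sciences, Life'),
  # Extended data: the subject is a different URI, so it is left out below.
  Triple.new('http://id.worldcat.org/fast/998327',
             'http://www.w3.org/2004/02/skos/core#prefLabel',
             'Life sciences--Computer programs')
].freeze

# Hypothetical helper: keep only triples whose subject is the fetched term,
# then group their objects by predicate, similar in shape to "predicates".
def predicates_for(triples, term_uri)
  triples.select { |triple| triple.subject == term_uri }
         .group_by(&:predicate)
         .transform_values { |matches| matches.map(&:object) }
end

predicates_for(GRAPH, TERM_URI).each do |predicate, objects|
  puts "#{predicate}: #{objects.inspect}"
end
```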
NOTE: The qa request is converted to the following LOC request. This is for information only. You do not need to know this to use QA.
http://id.loc.gov/authorities/subjects/sh85076841
/ENGINE_MOUNT/show/linked_data/loc/subjects/sh85076841?format=jsonld
Result:
{
"@context": {
"mads": "http://www.loc.gov/mads/rdf/v1#",
"skos": "http://www.w3.org/2004/02/skos/core#",
"skosxl": "http://www.w3.org/2008/05/skos-xl#",
"identifiers": "http://id.loc.gov/vocabulary/identifiers/",
"owl": "http://www.w3.org/2002/07/owl#",
"xsd": "http://www.w3.org/2001/XMLSchema#"
},
"@graph": [
{
"@id": "http://id.worldcat.org/fast/998327",
"skos:prefLabel": "Life sciences--Computer programs",
"@type": [
"mads:Authority",
"skos:Concept"
],
"mads:authoritativeLabel": "Life sciences--Computer programs"
},
{
"@id": "http://id.worldcat.org/fast/998329",
"skos:prefLabel": "Life sciences--Data processing",
"@type": [
"mads:Authority",
"skos:Concept"
],
"mads:authoritativeLabel": "Life sciences--Data processing"
},
{
"@id": "http://id.worldcat.org/fast/998325",
"skos:prefLabel": "Life sciences--Authorship",
"@type": [
"mads:Authority",
"skos:Concept"
],
"mads:authoritativeLabel": "Life sciences--Authorship"
},
etc.
]
}
NOTE: The results when requesting json-ld are the full graph as it is returned from the external authority request. This will include extended triples that have a subject URI different from the URI of the fetched term.
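Because the json-ld response carries the whole graph, a consuming app may want to separate the fetched term from the extended nodes. The sketch below shows one way to do that; the `partition_graph` helper is hypothetical, and the sample body is a heavily trimmed and simplified stand-in for a real response (real responses may represent labels differently, for example as language-tagged values).

```ruby
require 'json'

TERM_URI = 'http://id.loc.gov/authorities/subjects/sh85076841'

# Simplified sample; a real response contains many more nodes and properties.
JSONLD_BODY = <<~JSON
  {
    "@graph": [
      { "@id": "http://id.loc.gov/authorities/subjects/sh85076841",
        "skos:prefLabel": "Life sciences" },
      { "@id": "http://id.worldcat.org/fast/998327",
        "skos:prefLabel": "Life sciences--Computer programs" },
      { "@id": "http://id.worldcat.org/fast/998329",
        "skos:prefLabel": "Life sciences--Data processing" }
    ]
  }
JSON

# Hypothetical helper: split the graph into nodes describing the fetched term
# and extended nodes whose @id is some other URI.
def partition_graph(jsonld_body, term_uri)
  JSON.parse(jsonld_body).fetch('@graph', []).partition { |node| node['@id'] == term_uri }
end

term_nodes, extended_nodes = partition_graph(JSONLD_BODY, TERM_URI)
puts term_nodes.first['skos:prefLabel']                     # prints Life sciences
puts extended_nodes.map { |node| node['skos:prefLabel'] }   # prints the two extended labels
```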
NOTE: The qa request is converted to the following LOC request. This is for information only. You do not need to know this to use QA.
http://id.loc.gov/authorities/subjects/sh85076841
To run this example, copy the dbpedia_direct.json configuration to /config/authorities/linked_data/ and restart the rails server.
/ENGINE_MOUNT/fetch/linked_data/dbpedia_direct?uri=http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art
Result:
{
"uri":"http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art",
"id":"http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art",
"label":["Herbert F. Johnson Museum of Art"],
"altlabel":[],
"narrower":[""],
"broader":[""],
"sameas":[""],
"predicates":{
"http://purl.org/dc/terms/subject":["http://dbpedia.org/resource/Category:Art_museums_in_New_York","http://dbpedia.org/resource/Category:University_art_museums_and_galleries_in_New_York",etc.],
"http://www.w3.org/2003/01/geo/wgs84_pos#geometry":["POINT(-76.486465454102 42.450839996338)"],
"http://xmlns.com/foaf/0.1/homepage":["http://museum.cornell.edu/"],
"http://dbpedia.org/ontology/thumbnail":["http://commons.wikimedia.org/wiki/Special:FilePath/Johnson-museum-of-art-cornell.JPG?width=300"],
etc.
}
}
NOTE: The qa request is converted to the following dbPedia request. This is for information only. You do not need to know this to use QA.
http://dbpedia.org/resource/Herbert_F._Johnson_Museum_of_Art?locale=en
Not supported
Each authority has its own documentation. The LD4 series of grants has created a resource that has links to many authorities that support access via linked data APIs. See ld4p/linked_data_authorities for more information.