An Overview of the JPS-RDF Schema: Data that is obtainable from JPS SPARQL endpoints
1. URIs of description information and obtainable data formats
Japan Search promotes the utilization of registered metadata by converting metadata about various content that has been aggregated in a variety of formats into a common format, which is then provided to users as linked open data based on the Resource Description Framework (RDF). This common format is called the JPS schema (JPS-RDF), and the data formatted this way is called JPS-RDF data.
The set of JPS-RDF data associated with a particular object is called description information. A unique URI starting with “https://jpsearch.go.jp/data/” is given to the description information of each object registered in Japan Search.
The description information of an individual object can be browsed in HTML format by using a browser to request the URI of the description information. Additionally, file name extensions can be added to the URI to request a specific format, such as JSON-LD, RDF/XML, or Turtle, as follows.
- https://jpsearch.go.jp/data/photo-00010_00558_0006 (Responding formats vary depending on the request)
- https://jpsearch.go.jp/data/photo-00010_00558_0006.json (JSON-LD)
- https://jpsearch.go.jp/data/photo-00010_00558_0006.jsonld (JSON-LD)
- https://jpsearch.go.jp/data/photo-00010_00558_0006.rdf (RDF/XML)
- https://jpsearch.go.jp/data/photo-00010_00558_0006.xml (RDF/XML)
- https://jpsearch.go.jp/data/photo-00010_00558_0006.ttl (Turtle)
* When metadata is retrieved in Turtle format, non-ASCII characters included in IRIs are encoded as Unicode characters.
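The extension scheme above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names `description_url` and `fetch_description` are hypothetical, and the extension table only mirrors the list above.

```python
from typing import Optional
from urllib.request import urlopen

# Base URI of description information, as stated above.
BASE = "https://jpsearch.go.jp/data/"

# File name extensions from the list above and the format each one requests.
FORMAT_EXTENSIONS = {
    "json": "JSON-LD",
    "jsonld": "JSON-LD",
    "rdf": "RDF/XML",
    "xml": "RDF/XML",
    "ttl": "Turtle",
}


def description_url(item_id: str, extension: Optional[str] = None) -> str:
    """Build the URI of an item's description information (hypothetical helper)."""
    if extension is None:
        # Without an extension, the responding format depends on the request.
        return BASE + item_id
    if extension not in FORMAT_EXTENSIONS:
        raise ValueError(f"unsupported extension: {extension}")
    return f"{BASE}{item_id}.{extension}"


def fetch_description(item_id: str, extension: str, timeout: float = 30.0) -> str:
    """Download the description information as text (performs a network request)."""
    with urlopen(description_url(item_id, extension), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `description_url("photo-00010_00558_0006", "ttl")` builds the Turtle URI shown in the list, and `fetch_description` would download it.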
SPARQL is the standard query language for RDF data. By sending queries to a SPARQL endpoint, external applications can search and retrieve JPS-RDF data. Search results can be obtained in JSON, XML, Turtle, or similar formats. The EasySPARQL interface makes it convenient to use SPARQL from a browser.
- Overview of JPS SPARQL Endpoint (in Japanese)
- Japan Search RDF Model Primer (unofficial) (External site: The Web KANZAKI)
- SPARQL Endpoint
- EasySPARQL
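As a rough sketch of how an external application might send a query, the following Python code prepares a GET request in the style of the standard SPARQL protocol. The endpoint address in `ENDPOINT` is an assumption that is not stated on this page and should be checked against the SPARQL Endpoint link above. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# ASSUMPTION: endpoint address; verify it on the "SPARQL Endpoint" page.
ENDPOINT = "https://jpsearch.go.jp/rdf/sparql/"


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Prepare a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_select(query: str, endpoint: str = ENDPOINT, timeout: float = 30.0) -> list:
    """Send a SELECT query and return the result bindings (network request)."""
    with urlopen(build_request(query, endpoint), timeout=timeout) as resp:
        return json.load(resp)["results"]["bindings"]
```

Calling `run_select("SELECT * WHERE { ?s ?p ?o } LIMIT 5")` would return a list of dictionaries, one per result row, provided the endpoint supports the JSON results format requested here.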
2. Data conversion and normalization
All metadata aggregated from providers who permit Japan Search to provide their metadata via an API are converted into JPS-RDF. As of February 2022, 140 of the 160 databases from which Japan Search aggregates metadata allow us to do so, and the triple store now contains approximately 1.08 billion triples.
During conversion to JPS-RDF, the values “time (when),” “location (where),” and “person or organization name (who)” are normalized as much as possible. This makes it possible for users to obtain highly comprehensive search results for precise conditions. All of these values are given URIs, and those that can be normalized are linked to the LOD hubs (URIs that are often linked from other data on the web), so that the data on Japan Search is easily referenced from other data on the web.
Values for content type (painting, sculpture, specimen, etc.) and the subject of the work are determined from information contained in the source data. Wherever possible, we link these values to URIs of existing classification systems in order to expand opportunities for discovering content.
- Normalization in JPS-RDF (Time, Place, Name)
- Content Types (Classes) at Japan Search
- Japan Search Normalized Name Index (External site: The Web KANZAKI)
3. The JPS-RDF data model
The JPS-RDF data model has two parts: the content description, and the access and source information. The content description has two layers: a simple description and a structured description. The access and source information also has two layers. The first is access information, which is the information necessary to use the content. The second is source information, which covers both the source data (information on the provided metadata) and the metadata provider. Thus, this data model consists of four layers of properties.
- 3-1. Simple description properties
- 3-2. Structured description properties
- 3-3. Access information properties
- 3-4. Source information properties
3-1. Simple description properties
There are 23 simple description properties, which provide information that is widely used to search for content, such as names, languages, and subjects. Most of them use the vocabulary of schema.org, which is widely used to mark up content on the web, so that they are easy to understand for a wide range of users.
- rdf:type
- rdfs:label
- schema:name
- schema:contributor
- schema:creator
- schema:publisher
- schema:temporal
- schema:dateCreated
- schema:datePublished
- schema:spatial
- schema:about
- schema:category
- schema:identifier
- schema:isbn
- schema:issn
- schema:inLanguage
- schema:image
- schema:description
- schema:isPartOf
- schema:hasPart
- schema:relatedLink
- schema:exampleOfWork
- schema:workPerformed
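To illustrate how these properties might be used in a query, the sketch below builds the text of a SPARQL SELECT query that asks for a few simple description properties of one item. It only constructs the query string; it assumes that the item is identified by the same URI as its description information, and the function name `simple_description_query` is hypothetical.

```python
SCHEMA_NS = "http://schema.org/"

# Local names of the schema.org properties listed above.
SIMPLE_SCHEMA_PROPERTIES = {
    "name", "contributor", "creator", "publisher", "temporal",
    "dateCreated", "datePublished", "spatial", "about", "category",
    "identifier", "isbn", "issn", "inLanguage", "image", "description",
    "isPartOf", "hasPart", "relatedLink", "exampleOfWork", "workPerformed",
}


def simple_description_query(item_uri, properties=("name", "creator", "dateCreated")):
    """Build a SELECT query for some simple description properties of one item."""
    if any(ch in item_uri for ch in '<>" \n\t'):
        raise ValueError("item_uri contains characters not allowed in an IRI")
    lines = [f"PREFIX schema: <{SCHEMA_NS}>", "SELECT * WHERE {"]
    for prop in properties:
        if prop not in SIMPLE_SCHEMA_PROPERTIES:
            raise ValueError(f"not a simple description property: schema:{prop}")
        # OPTIONAL keeps the other columns even when one property is absent.
        lines.append(f"  OPTIONAL {{ <{item_uri}> schema:{prop} ?{prop} }}")
    lines.append("}")
    return "\n".join(lines)


print(simple_description_query("https://jpsearch.go.jp/data/photo-00010_00558_0006"))
```

Each property is wrapped in OPTIONAL because not every item is guaranteed to carry every property.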
3-2. Structured description properties
Information about time (when), location (where), and person or organization name (who) related to an object is provided in a structured way, as is “part of” information, which appears when the object forms part of another object. Structuring here means breaking the information down into multiple elements.
For example, if the information is about a person, the person’s name and its romanization, the person’s role in relation to the object (e.g. director or cast), and other related information are provided in separate properties. These are the elements of information about the person. Such detailed information is useful when identifying the required content in search results or when filtering by complex conditions.
There are five structured description properties in JPS-RDF. They are defined in the Japan Search property namespace, for which we have defined the vocabulary prefix jps:. (Existing vocabularies such as schema.org are also used for sub-properties.)
3-3. Access information properties
The access information properties provide information related to the retrieval and use of content in a structured form. In this context, content refers to individual cultural and academic information resources. In addition to digital content, it also includes analog materials and works that have not been digitized. For example, in addition to the name of the institution that holds a particular painting, the access information will also include the URL of digitized images of the painting.
3-4. Source information properties
Source information about the metadata provided to Japan Search, including the URLs of the original metadata, is provided in a structured form. It includes the URL of the system that provided the metadata to Japan Search, the URL of the source data stored within the Japan Search system, and other relevant data.
4. Namespaces
Japan Search has defined its own namespace for its original properties.
Vocabulary Title | Namespace Name | Prefix |
---|---|---|
JPS-RDF properties | https://jpsearch.go.jp/term/property# | jps: |
We also use the following existing vocabularies.
Vocabulary Title | Namespace Name | Prefix |
---|---|---|
OWL Web Ontology Language | http://www.w3.org/2002/07/owl# | owl: |
RDF Vocabulary | http://www.w3.org/1999/02/22-rdf-syntax-ns# | rdf: |
RDF Schema | http://www.w3.org/2000/01/rdf-schema# | rdfs: |
SKOS | http://www.w3.org/2004/02/skos/core# | skos: |
Schema.org Vocabulary | http://schema.org/ | schema: |
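The prefixes in the two tables above can be used to expand a prefixed name such as schema:name into a full IRI, or to generate the PREFIX declarations that open a SPARQL query. The following is a minimal sketch; the function names `expand` and `sparql_prologue` are hypothetical.

```python
# Prefix-to-namespace mapping, copied from the tables above.
PREFIXES = {
    "jps": "https://jpsearch.go.jp/term/property#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "schema": "http://schema.org/",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'schema:name' into a full IRI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


def sparql_prologue(prefixes=PREFIXES) -> str:
    """Render the mapping as PREFIX declarations for the top of a SPARQL query."""
    return "\n".join(f"PREFIX {p}: <{ns}>" for p, ns in prefixes.items())


print(expand("schema:name"))
print(sparql_prologue())
```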
5. References
- JPS-RDF Schema (Written in Japanese)