
An Overview of the JPS-RDF Schema: Data that is obtainable from JPS SPARQL endpoints

1. URIs of description information and obtainable data formats

Japan Search promotes the utilization of registered metadata by converting metadata about various content that has been aggregated in a variety of formats into a common format, which is then provided to users as linked open data based on the Resource Description Framework (RDF). This common format is called the JPS schema (JPS-RDF), and the data formatted this way is called JPS-RDF data.

The set of JPS-RDF data associated with a particular object is called description information. A unique URI starting with “https://jpsearch.go.jp/data/” is given to the description information of each object registered in Japan Search.

The description information of an individual object can be browsed in HTML format by requesting its URI in a browser. Additionally, a file name extension can be added to the URI to request a specific format, such as JSON-LD, RDF/XML, or Turtle.
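The format-specific requests described above can be sketched in Python as follows. This is a minimal illustration, not the official client: the extension strings in EXTENSIONS and the identifier "example-0001" are assumptions made for the example and should be checked against the Japan Search documentation. Only the base URI comes from the text above.

```python
from typing import Optional

# Base URI for description information (stated in the text above).
BASE = "https://jpsearch.go.jp/data/"

# ASSUMPTION: these file name extensions are illustrative guesses,
# not confirmed by this document.
EXTENSIONS = {
    "json-ld": ".json",
    "rdf/xml": ".rdf",
    "turtle": ".ttl",
}


def description_url(item_id: str, fmt: Optional[str] = None) -> str:
    """Return the URI of an object's description information.

    With fmt=None the plain URI is returned (HTML in a browser);
    otherwise the assumed extension for that format is appended.
    """
    if fmt is None:
        return BASE + item_id
    try:
        return BASE + item_id + EXTENSIONS[fmt.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None


if __name__ == "__main__":
    # "example-0001" is a hypothetical identifier.
    print(description_url("example-0001"))
    print(description_url("example-0001", "turtle"))
```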

SPARQL is the standard language for using linked data and can be used to query the SPARQL endpoint as a means of searching and retrieving JPS-RDF data using an external application. Search results can be obtained in JSON, XML, Turtle, or similar formats. The EasySPARQL interface makes it convenient to use SPARQL from a browser.
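As a rough sketch of how an external application might prepare such a query, the following Python code builds a SPARQL SELECT query and an HTTP GET request that asks for JSON results through the standard Accept header of the SPARQL protocol. The endpoint URL in ENDPOINT, the use of rdfs:label as the property to match, and the function names are assumptions for illustration. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# ASSUMPTION: endpoint URL is not stated in this document; verify before use.
ENDPOINT = "https://jpsearch.go.jp/rdf/sparql/"


def build_query(label: str, limit: int = 10) -> str:
    """Build a SELECT query for items whose rdfs:label equals `label`."""
    # Escape backslashes and double quotes for a SPARQL string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item WHERE {\n"
        f'  ?item rdfs:label "{escaped}" .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request(query: str) -> Request:
    """Wrap the query in a GET request that asks for JSON results."""
    url = ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    req = build_request(build_query("富士山", limit=5))
    print(req.full_url)
    # To execute: urllib.request.urlopen(req), then parse the body with json.
```

Sending the request with urllib.request.urlopen and parsing the response with the json module would complete the round trip. Other result formats can be requested by changing the Accept header, subject to what the endpoint supports.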

2. Data conversion and normalization

All metadata aggregated from providers who permit Japan Search to provide their metadata via an API are converted into JPS-RDF. As of September 2020, 45 of the 111 databases from which Japan Search aggregates metadata allow us to do so, and the triple store now contains approximately 950 million triples.

During conversion to JPS-RDF, the values “time (when),” “location (where),” and “person or organization name (who)” are normalized as much as possible. This makes it possible for users to obtain highly comprehensive search results for precise conditions. All of these values are given URIs, and those that can be normalized are linked to the LOD hubs (URIs that are often linked from other data on the web), so that the data on Japan Search is easily referenced from other data on the web.

Values for content type (painting, sculpture, specimen, etc.) and the subject of the work are determined from information contained in the source data. Wherever possible, we link these values to URIs of existing classification systems in order to expand opportunities for discovering content.

3. The JPS-RDF data model

The JPS-RDF data model has two parts: the content description and the access and source information. The content description has two layers: a simple description and a structured description. The access and source information also has two layers: the information necessary to use the content (access information) and the name of the database (source information) that provided the metadata to Japan Search. Thus, this data model consists of four layers of properties.

  3-1. Simple description properties
  3-2. Structured description properties
  3-3. Access information properties
  3-4. Source information properties

3-1. Simple description properties

There are 20 simple description properties, which provide information that is widely used to search for content, such as names, languages, and subjects. These properties use the vocabulary of schema.org, which is used to mark up content on the web, and are intended to be easy for a wide range of users to understand.

As databases are added, other vocabulary provided by schema.org may be used for auxiliary properties.

3-2. Structured description properties

Information about the time (when), location (where), and person or organization name (who) related to an object, as well as “part of” information, which appears when the object forms part of another object, is provided in a structured way. Structuring expresses the information by breaking it down into multiple elements.

For example, if the information is about a person, the person’s name and its romanization, the person’s involvement in the object (e.g. director or cast), and other related information are provided in separate properties. These are the elements of information about the person. Such detailed information is useful for identifying the required content in search results or for filtering by complex conditions.

There are five structured description properties in JPS-RDF. They are defined in the Japan Search property namespace, for which we have defined the vocabulary prefix jps:. (Existing vocabularies such as schema.org are also used for sub-properties.)
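To make the idea of breaking "who" information into elements more concrete, the sketch below expresses one person's involvement in an object as a small set of triples. It is illustrative only: the property names jps:agential, jps:relationType, and jps:value, the role and ex: identifiers, and the sample name are all assumptions for this example, and the actual five properties should be taken from the JPS-RDF specification.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def describe_agent(item: str, node: str, role: str, agent: str,
                   name: str, romanized: str) -> List[Triple]:
    """Break one "who" statement into separate elements.

    ASSUMPTION: jps:agential, jps:relationType and jps:value are
    hypothetical property names used only for this illustration.
    """
    return [
        # The object points to a node that groups the elements.
        (item, "jps:agential", node),
        # How the person is involved in the object (e.g. director).
        (node, "jps:relationType", role),
        # Who the person is, as a URI that can be normalized and linked.
        (node, "jps:value", agent),
        # The person's name and its romanization, kept as separate values.
        (agent, "rdfs:label", f'"{name}"'),
        (agent, "schema:name", f'"{romanized}"@ja-Latn'),
    ]


if __name__ == "__main__":
    # All identifiers below are placeholders.
    for triple in describe_agent("ex:film-0001", "_:who1", "ex:role-director",
                                 "ex:person-0001", "山田太郎", "Yamada Tarō"):
        print(" ".join(triple), ".")
```

Because the role and the person are separate elements, a query can filter on either one independently, for example all objects where a given person appears as director.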

3-3. Access information properties

The access information properties provide information related to the retrieval and use of content in a structured form. In this context, content refers to individual cultural and academic information resources. In addition to digital content, it also includes analog materials and works that have not yet been digitized. For example, in addition to the name of the institution that holds a particular painting, the access information will also include the URL of digitized images of the painting.

3-4. Source information properties

The source information about the metadata provided to Japan Search and the URLs of the original metadata are provided in a structured form. The source information includes the URL of the system that provided the metadata to Japan Search, the URL of the source data stored within the Japan Search system, and other relevant data.

4. Namespaces

Japan Search defines its own namespace for its original properties.

Vocabulary Title | Namespace Name | Prefix
JPS-RDF properties | https://jpsearch.go.jp/term/property# | jps:

We also use the following existing vocabularies.

Vocabulary Title | Namespace Name | Prefix
OWL Web Ontology Language | http://www.w3.org/2002/07/owl# | owl:
RDF Vocabulary | http://www.w3.org/1999/02/22-rdf-syntax-ns# | rdf:
RDF Schema | http://www.w3.org/2000/01/rdf-schema# | rdfs:
SKOS | http://www.w3.org/2004/02/skos/core# | skos:
Schema.org Vocabulary | http://schema.org/ | schema:
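The prefixes listed above can be kept in a small lookup table when working with JPS-RDF data in code. The following Python sketch uses exactly the namespace names given in this section. The helper names expand and sparql_prefix_header are hypothetical and chosen only for this example.

```python
# Prefix-to-namespace table, copied from the lists in this section.
PREFIXES = {
    "jps": "https://jpsearch.go.jp/term/property#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "schema": "http://schema.org/",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as "schema:name" into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


def sparql_prefix_header(*prefixes: str) -> str:
    """Return PREFIX declarations for the given prefixes, one per line."""
    return "\n".join(f"PREFIX {p}: <{PREFIXES[p]}>" for p in prefixes)


if __name__ == "__main__":
    print(expand("schema:name"))
    print(sparql_prefix_header("jps", "schema"))
```

Prepending the output of sparql_prefix_header to a query body keeps the prefix declarations consistent with the namespaces defined here.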

5. References