JSON
data.world supports uploading data in the JSON data format. Files in this format can be either .json files using the standard JSON serialization or .yaml files using the alternative YAML serialization. Beyond uploading information stored in JSON files, JSON support also allows data to be synchronized from web services that return their data as JSON. In this way, data from web services can be queried using either SQL or SPARQL.
Where possible, data.world will attempt to infer a tabular structure from JSON data and will create triples based on that structure. For JSON data that is fairly regular and shallow (like many log files), mapping to a tabular structure works well, but it becomes unwieldy for more deeply nested and irregular JSON structures. For those, the tabular JSON mapping is lossy: it flattens deeper JSON structures into simple strings.
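To illustrate why a tabular mapping is lossy for nested data, here is a minimal Python sketch. It is not data.world's actual inference algorithm: the function name flatten_rows, the column-ordering rule, and the sample records are all assumptions made for illustration. It only shows the general effect of collapsing nested values into strings.

```python
import json


def flatten_rows(records):
    """Turn a list of JSON objects into tabular rows (hypothetical helper).

    Nested objects and arrays are serialized to strings, which is where
    the structure is lost.
    """
    # Collect column names in first-seen order.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    rows = []
    for record in records:
        row = {}
        for col in columns:
            value = record.get(col)  # missing keys become None
            if isinstance(value, (dict, list)):
                value = json.dumps(value)  # nested structure becomes a string
            row[col] = value
        rows.append(row)
    return columns, rows


# Made-up sample records: the second one carries a nested object.
records = [
    {"abbreviation": "CO", "elevation": 4401},
    {"abbreviation": "WA", "location": {"lat": 46.85}},
]
columns, rows = flatten_rows(records)
for row in rows:
    print(row)
```

After flattening, the location column holds a plain string, so the lat value inside it can no longer be queried as a field of its own.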
For all JSON files (tabular or not), data.world also performs a second, more powerful direct mapping that is better suited to deeply nested and irregular structures. Using the https://json2rdf.data.world/ namespace (abbreviated as j2r:), data.world creates RDF triples from each JSON document as follows:
- Every JSON document is an instance of j2r:Document (i.e., there is a triple of the form documentIri a j2r:Document, where documentIri is a placeholder for the IRI of the document).
- The filename of the document is referenced as the dwo:name of the instance (i.e., there is a triple of the form documentIri dwo:name "document name").
- A triple with the predicate j2r:topLevel points to the top-level element in the document. This triple looks like documentIri j2r:topLevel elementIri.
Elements in the JSON document are represented in RDF as one of the following:
- j2r:Object instances, which map to JSON objects (maps of key-value pairs within {}). Each j2r:Object instance will have a triple elementIri a j2r:Object.
- RDF Lists, which map to JSON arrays (lists of values within []).
- RDF Literals, which map to literal values that appear in arrays or as object values.
Every key in a JSON object gets defined as an rdf:Property. Those properties are defined in a child namespace of the namespace for the dataset or project, formed by appending json-terms/ to the end of the dataset or project namespace. The child namespace is commonly abbreviated as j:. What this means is that for each key in an object, there will be a triple objectIri j:termname keyValue. The key value may be a literal, a list, or the IRI of a contained object.
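The direct mapping described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not data.world's implementation: the Iri marker class, the json_to_triples function, the ex: base, and the numbered IRI scheme are all hypothetical (real IRIs are opaque), keys are appended to j: verbatim with no escaping, and JSON null is not handled because the text does not say how it is mapped. RDF lists are built with the standard rdf:first / rdf:rest / rdf:nil structure.

```python
import itertools


class Iri(str):
    """Marks a string as an IRI (written as a prefixed name), not a literal."""


def json_to_triples(doc, filename, base="ex:"):
    """Return (subject, predicate, object) triples for one parsed JSON document."""
    counter = itertools.count(1)
    triples = []
    seen_terms = set()

    def new_iri(kind):
        # Hypothetical IRI scheme, used only to keep the sketch readable.
        return Iri(f"{base}{kind}-{next(counter)}")

    def convert(value):
        if isinstance(value, dict):
            node = new_iri("object")
            triples.append((node, Iri("rdf:type"), Iri("j2r:Object")))
            for key, child in value.items():
                prop = Iri(f"j:{key}")
                if key not in seen_terms:
                    # Define the property once, with its original key string.
                    seen_terms.add(key)
                    triples.append((prop, Iri("rdf:type"), Iri("rdf:Property")))
                    triples.append((prop, Iri("j2r:term"), key))
                triples.append((node, prop, convert(child)))
            return node
        if isinstance(value, list):
            items = [convert(item) for item in value]
            head = Iri("rdf:nil")
            # Build the RDF list back to front so each cell points at the rest.
            for item in reversed(items):
                cell = new_iri("list")
                triples.append((cell, Iri("rdf:first"), item))
                triples.append((cell, Iri("rdf:rest"), head))
                head = cell
            return head
        return value  # strings, numbers, and booleans stay as literals

    doc_iri = new_iri("document")
    triples.append((doc_iri, Iri("rdf:type"), Iri("j2r:Document")))
    triples.append((doc_iri, Iri("dwo:name"), filename))
    triples.append((doc_iri, Iri("j2r:topLevel"), convert(doc)))
    return triples


# Made-up sample document with one nested array of objects.
sample = {"abbreviation": "CO", "peaks": [{"elevation": 4401}]}
for triple in json_to_triples(sample, "elevations.json"):
    print(triple)
```

Because nested objects and arrays become nodes of their own rather than strings, every value in the document remains reachable by following triples from the top-level element.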
There is also a property j2r:term defined on each rdf:Property, which points to the string value of the JSON key. That is to say, there is a triple of the form j:termname j2r:term "termname". This triple enables dynamic term discovery for maps with open models.
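As a rough illustration of term discovery, the following Python sketch looks up values by the original JSON key string instead of hard-coding the property IRI. The triples, the ex: IRIs, and the helper names list_terms and values_for_term are made up for this example; on data.world the same lookup would be expressed in SPARQL.

```python
def list_terms(triples):
    """Return every JSON key string recorded through j2r:term."""
    return sorted(o for _, p, o in triples if p == "j2r:term")


def values_for_term(triples, term):
    """Find (subject, value) pairs for the property whose JSON key is `term`."""
    # Step 1: discover which properties carry this key string.
    props = {s for s, p, o in triples if p == "j2r:term" and o == term}
    # Step 2: collect every triple that uses one of those properties.
    return [(s, o) for s, p, o in triples if p in props]


# Hypothetical triples in the shape described above (IRIs as plain strings).
triples = [
    ("j:abbreviation", "j2r:term", "abbreviation"),
    ("j:elevation", "j2r:term", "elevation"),
    ("ex:object-1", "j:abbreviation", "CO"),
    ("ex:object-1", "j:elevation", 4401),
    ("ex:object-2", "j:abbreviation", "WA"),
    ("ex:object-2", "j:elevation", 4392),
]

print(list_terms(triples))
print(values_for_term(triples, "elevation"))
```

The two-step lookup means a consumer does not need to know the dataset's j: namespace in advance, which is what makes this useful for objects whose set of keys is not fixed.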
Note: data.world coins IRIs for each JSON element to guarantee uniqueness, but does not otherwise guarantee any particular structure of that IRI. The IRIs should be used for identity only and not parsed for their constituent structure.
The structure of data.world's direct mapping allows for an idiomatic model for querying JSON documents in SPARQL. All data in the JSON model is available for querying directly and is not flattened or otherwise made lossy.
Examples - JSON
Here is an example of a query showing how to access data using triples generated for JSON documents:
PREFIX : <https://ddw-doccorp.linked.data.world/d/sparql-json-dataset/>
PREFIX json2rdf: <https://json2rdf.data.world/>
PREFIX jsonterm: <https://ddw-doccorp.linked.data.world/d/sparql-json-dataset/json-terms/>

SELECT ?abbreviation ?elevation {
  # Discover the property whose JSON key is "abbreviation"
  ?abbrp a json2rdf:Property .
  ?abbrp json2rdf:term "abbreviation" .
  ?q ?abbrp ?abbreviation .

  # Discover the property whose JSON key is "elevation"
  ?elevp a json2rdf:Property .
  ?elevp json2rdf:term "elevation" .
  ?q ?elevp ?elevation .
}
This is what the query looks like when run on data.world: [screenshot]
Here is an example of a query which shows how to access data using triples generated for tabular JSON documents:
PREFIX : <https://ddw-doccorp.linked.data.world/d/sparql-json-dataset/>

SELECT ?abbreviation ?elevation {
  # Each row of the inferred table is an instance of the table class
  ?q a :tbl-elevations .
  ?q :col-elevations-abbreviation ?abbreviation .
  ?q :col-elevations-elevation ?elevation .
}
This is what the query looks like when run on data.world: [screenshot]