XML
data.world supports uploading data in the XML format. This covers information stored in XML files and also lets data be synchronized from web services that return XML, so data from those services can be queried using SPARQL.
For all XML files, data.world applies a mapping from XML to RDF. It creates RDF triples from each XML document in the namespace https://xml2rdf.data.world/ (abbreviated as x2r:). Every XML document is an instance of x2r:Document. The filename of the document is recorded as the dwo:name of the instance. dwo:name is the name predicate of the data.world ontology; its full unprefixed form is https://ontology.data.world/v0#name. There is also a triple with the predicate x2r:topLevel that points to the top-level element in the document.
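As a minimal sketch of how these document-level triples can be used, the query below lists each uploaded XML document with its filename and the tag of its top-level element. It relies only on the predicates named above. The variable names are illustrative, and it assumes the top-level element carries an x2r:tag triple like any other element.

```sparql
PREFIX x2r: <https://xml2rdf.data.world/>
PREFIX dwo: <https://ontology.data.world/v0#>

# List every XML document with its filename and the tag of its root element.
SELECT ?document ?filename ?rootTag
WHERE {
  ?document a x2r:Document ;
            dwo:name ?filename ;
            x2r:topLevel ?root .
  ?root x2r:tag ?rootTag .
}
ORDER BY ?filename
```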
Components in the XML document are represented as one of the following:
x2r:Element instances - These map to XML elements.
x2r:TextNode instances - These map to XML text.
Blank nodes representing XML attributes
For each element, data.world generates triples for:
The local name of the tag of the element using predicate x2r:tag
The namespace of the tag of the element using predicate x2r:xmlns
A pointer to the element’s parent using predicate x2r:parent
A pointer to the top-level document containing the element using predicate x2r:containedIn
RDF lists of the child elements contained in the element, grouped by tag
An RDF list of the child text nodes contained in the element
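When exploring an unfamiliar document, a useful first step is to see which tags occur and where. The following sketch counts elements per tag and, where a parent element exists, reports the parent's tag. It uses only x2r:tag and x2r:parent from the list above. The OPTIONAL block is an assumption that the top-level element has no parent carrying an x2r:tag, in which case ?parentTag is simply left unbound.

```sparql
PREFIX x2r: <https://xml2rdf.data.world/>

# Count elements per tag, grouped with the tag of the parent element.
SELECT ?parentTag ?tag (COUNT(?element) AS ?count)
WHERE {
  ?element a x2r:Element ;
           x2r:tag ?tag .
  # The root element may have no tagged parent, so this part is optional.
  OPTIONAL {
    ?element x2r:parent ?parent .
    ?parent x2r:tag ?parentTag .
  }
}
GROUP BY ?parentTag ?tag
ORDER BY ?parentTag ?tag
```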
For each text node, data.world generates triples for:
A pointer to the text node’s parent using predicate x2r:parent
A pointer to the top-level document containing the text node using predicate x2r:containedIn
The textual content of the text node using predicate x2r:content
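The text node triples can be combined with the element and document triples to see where each piece of text lives. This sketch returns the text content alongside the tag of the enclosing element and the filename of the containing document. The variable names and the LIMIT value are arbitrary choices for illustration.

```sparql
PREFIX x2r: <https://xml2rdf.data.world/>
PREFIX dwo: <https://ontology.data.world/v0#>

# Show each text node's content with its enclosing tag and source file.
SELECT ?filename ?tag ?text
WHERE {
  ?textNode a x2r:TextNode ;
            x2r:content ?text ;
            x2r:parent ?element ;
            x2r:containedIn ?document .
  ?element x2r:tag ?tag .
  ?document dwo:name ?filename .
}
LIMIT 100
```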
For each attribute, data.world generates triples for:
The local name of the tag of the attribute using predicate x2r:tag
The namespace of the tag of the attribute using predicate x2r:xmlns
The value of the attribute using predicate x2r:value
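A rough sketch of reading attributes follows. Since x2r:value is described only for attributes, matching on it should select the attribute blank nodes. The namespace is wrapped in OPTIONAL on the assumption that attributes without a namespace may lack an x2r:xmlns triple. The predicate that links an attribute to its owning element is not named in this section, so the sketch does not attempt that join.

```sparql
PREFIX x2r: <https://xml2rdf.data.world/>

# List attribute names and values; the namespace is included when present.
SELECT ?name ?namespace ?value
WHERE {
  ?attribute x2r:tag ?name ;
             x2r:value ?value .
  OPTIONAL { ?attribute x2r:xmlns ?namespace . }
}
ORDER BY ?name
```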
This mapping structure gives a fairly idiomatic model for querying XML documents in SPARQL. All data in the XML model is available for querying directly, although the queries may be somewhat verbose.
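Some of that verbosity can be trimmed with standard SPARQL 1.1 syntax. The sketch below assumes the endpoint accepts property paths, and uses "title" as a purely hypothetical tag name. The `;` shares one subject across several patterns, and the path ^x2r:parent/x2r:content steps from an element to any node whose x2r:parent is that element and then to that node's text content.

```sparql
PREFIX x2r: <https://xml2rdf.data.world/>

# "title" is a hypothetical tag name; substitute one from your own document.
SELECT ?text
WHERE {
  ?element a x2r:Element ;
           x2r:tag "title" ;
           ^x2r:parent/x2r:content ?text .
}
```

Written out in full, the path is equivalent to the two patterns ?textNode x2r:parent ?element and ?textNode x2r:content ?text.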
Example - XML
The following example illustrates the use of the data.world generated triples to get data out of XML:
PREFIX : <https://ddw-doccorp.linked.data.world/d/sparql-xml-dataset/>
PREFIX x2r: <https://xml2rdf.data.world/>

SELECT ?abbreviation ?passengers {
    ?airportNode a x2r:Element.
    ?airportNode x2r:tag "airport".

    ?abbreviationElement x2r:parent ?airportNode.
    ?abbreviationElement x2r:tag "abbreviation".
    ?abbreviationNode x2r:parent ?abbreviationElement.
    ?abbreviationNode x2r:content ?abbreviation.

    ?passengersElement x2r:parent ?airportNode.
    ?passengersElement x2r:tag "passengers".
    ?passengersNode x2r:parent ?passengersElement.
    ?passengersNode x2r:content ?passengers.
}
Here is what the query looks like when run on data.world: