Virtualized data sources

In addition to adding data files directly to data.world, you can connect to and query remote relational databases using our virtualization technology. Although the underlying databases are relational and speak SQL, you can query them with SPARQL and join in data from other data sources (almost) transparently. Querying a virtualized relational database with SPARQL is only slightly more difficult than querying data you have uploaded as files. Instructions for setting up a virtualized connection can be found in our section on using the Connection Manager.

Note

For technical reasons, data from virtualized databases is not available in the default graph of a dataset. Instead, it is only available in the :mapped graph.

To query against the RDF triples from the virtualized source, you’ll need to use FROM or FROM NAMED in your SPARQL query, and use the :mapped IRI to specify that you want the virtualized triples.

Example - Virtualized

Here is an example of a SPARQL query written against a live table in a relational database outside of data.world.

PREFIX : <https://ddw-doccorp.linked.data.world/d/sparql-virtualized/>

SELECT ?abbreviation ?runways
FROM NAMED :mapped 
{
    GRAPH :mapped {
        ?s :col-airports-abbreviation ?abbreviation.
        ?s :col-airports-runways      ?runways.
    }
}

This is what the query looks like when run on data.world. Notice the mappings required for specifying the live table in the dataset:

Querying_virtualized_data.png
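
The example above uses FROM NAMED together with a GRAPH block. As a minimal sketch of the FROM alternative mentioned earlier, the query below assumes the same dataset prefix and column IRIs as that example. Under standard SPARQL semantics, FROM makes the :mapped graph the default graph for this one query, so the triple patterns need no GRAPH wrapper. This variant has not been run against data.world here, so treat it as an illustration rather than a verified result.

```sparql
PREFIX : <https://ddw-doccorp.linked.data.world/d/sparql-virtualized/>

# FROM (rather than FROM NAMED) makes :mapped the default graph of this query,
# so the patterns below match the virtualized triples directly.
SELECT ?abbreviation ?runways
FROM :mapped
WHERE {
    ?s :col-airports-abbreviation ?abbreviation .
    ?s :col-airports-runways      ?runways .
}
```

FROM NAMED with GRAPH is the better fit when a single query also needs triples from other graphs, because each pattern block then states which graph it reads from.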