SPARQLing data.world

Introduction

SPARQL (pronounced "sparkle") is a query language for retrieving and modifying linked data. It is recognized as one of the key technologies of the semantic web because of its flexibility and the ease with which it joins complex data structures and detects intricate patterns in data. It is also the query language on which the data.world platform is based.
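As a minimal sketch of the kind of pattern matching and joining described above, the following standard SPARQL query selects airport codes, names, and optional nicknames. The `ex:` prefix and the property names `ex:code`, `ex:name`, and `ex:nickname` are hypothetical placeholders chosen for illustration; they are not the actual vocabulary used in the sample datasets.

```sparql
# Hypothetical vocabulary: the ex: prefix and property names are placeholders,
# not the actual terms used in the sample datasets.
PREFIX ex: <http://example.org/airports#>

SELECT ?code ?name ?nickname
WHERE {
  # Both triple patterns share ?airport, so their results are joined on it.
  ?airport ex:code ?code ;
           ex:name ?name .
  # The nickname is optional, so airports without one still appear.
  OPTIONAL { ?airport ex:nickname ?nickname . }
}
ORDER BY ?code
```

The shared variable `?airport` is what ties the patterns together: each result row combines values that belong to the same resource.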

The purpose of this document is to enable technical users who are familiar with SPARQL to use the full range of SPARQL functionality on data.world.

SPARQL is an expressive language for querying your data, but you need to know more than basic SPARQL to get the most out of data.world. data.world significantly extends the functionality of SPARQL with:

  • Specialized access to your data with named graph patterns

  • Extensions to the SPARQL language itself:

    • DIAGNOSE

    • WITH

    • Query-as-graph

  • Dozens of extension aggregations and functions with which to manipulate your data

New to SPARQL?

If you are new to SPARQL, here are some resources to help you get started:

What is covered

In this guide we will cover the following:

  • The basics of how queries work in SPARQL

  • The file types that can be queried and how RDF is generated for them (including how IRIs are generated, as appropriate)

  • Querying from virtualized data sources

  • Refining data access

  • data.world extensions to SPARQL

  • data.world enhancements to your data

Sample data

We have created a set of resources with sample data to use in this guide. The resources consist of a project, several datasets, and a number of queries. The project and dataset resources, with descriptions of the data in them, are listed in Table 1.

Important

All the links in the following examples open projects in the data.world open data community. For the best experience, make sure you are logged in to the application when you click these links.

Table 1. Sample data for SPARQL

| Type | Name | Data |
|---|---|---|
| project | sparql-project | Contains the datasets: sparql-csv-dataset; sparql-json-dataset; sparql-rdf-dataset; sparql-xml-dataset; sparql-tabular-dataset |
| dataset | sparql-with | Includes the files: airport.ttl - Airport codes, names, and nicknames; cities.csv - Airport names, cities, and states |
| dataset | sparql-tabular-dataset | Includes the files: cities.csv - Airport names, cities, and states; airport_data.xlsx, with the sheets cities - Airport names and codes, and hubs - Airport codes, airline hubs |
| dataset | sparql-rdf-dataset | Includes the file: airport.ttl - Airport codes, names, and nicknames |
| dataset | sparql-virtualized | Includes a live connection to the Snowflake database table: airports - Abbreviation (airport code) and number of runways |
| dataset | sparql-xml-dataset | Includes the file: passengers.xml - Abbreviation (airport code) and number of passengers |
| dataset | sparql-shacl | Includes the files: airport.ttl; shacl.ttl |
| dataset | sparql-json-dataset | Includes the file: elevations.json - Abbreviation (airport code) and elevation |
| dataset | ontology-sample-queries | Includes the file: sample.csv - A table with two columns of random numerical data |
| dataset | sparql-aggregation-sample-queries | Includes the file: aggregations.csv - A table with two columns of random numerical data |
| dataset | sparql-function-sample-queries | This dataset contains only queries |