Python

Data.world has developed an open source Python library for working with data.world datasets.

This library makes it easy for data.world users to pull and work with data stored on data.world. Additionally, the library provides convenient wrappers for data.world APIs, allowing users to create and update datasets, add and modify files, and possibly implement entire apps on top of data.world.

Installation

You can install the data.world Python library using pip directly from PyPI:

pip install datadotworld

Optionally, you can install the library including pandas support:

pip install datadotworld[pandas]

If you use conda to manage your Python distribution, you can install from the community-maintained [conda-forge](https://conda-forge.github.io/) channel:

conda install -c conda-forge datadotworld-py

Configuration

This library requires a data.world API authentication token to work.

Your authentication token can be obtained on data.world once you enable Python under Integrations > Python.

To configure the library, run the following command:

dw configure

Alternatively, tokens can be provided via the DW_AUTH_TOKEN environment variable. On macOS or Unix machines, run (replacing <YOUR_TOKEN> below with the token obtained earlier):

export DW_AUTH_TOKEN=<YOUR_TOKEN>

Define content for SDK host

If you want to connect to the public data.world API server, the default settings apply. However, if you want to connect to a private data.world environment (e.g., your_org.data.world), you can use environment variables to define the host:

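from os import environ


def create_url(subdomain, environment):
    # Insert the optional environment name between the subdomain and data.world
    if environment:
        subdomain = subdomain + '.' + environment

    return 'https://{}.data.world'.format(subdomain)

# Each host can be overridden individually; otherwise it is derived
# from DW_ENVIRONMENT (empty by default, i.e. the public servers).
DW_ENVIRONMENT = environ.get('DW_ENVIRONMENT', '')
API_HOST = environ.get('DW_API_HOST', create_url('api', DW_ENVIRONMENT))
DOWNLOAD_HOST = environ.get(
    'DW_DOWNLOAD_HOST', create_url('download', DW_ENVIRONMENT))
QUERY_HOST = environ.get('DW_QUERY_HOST', create_url('query', DW_ENVIRONMENT))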
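
To see how these settings resolve, the following sketch repeats the create_url logic and passes the environment as a plain dict, so it can be tried without changing real environment variables. The resolve_hosts helper and the your_org value are illustrative assumptions, not part of the library.

```python
def create_url(subdomain, environment):
    # Same logic as the snippet above: an optional environment name
    # is appended to the subdomain.
    if environment:
        subdomain = subdomain + '.' + environment
    return 'https://{}.data.world'.format(subdomain)


def resolve_hosts(env):
    """Hypothetical helper: resolve the three hosts from a dict-like env."""
    dw_environment = env.get('DW_ENVIRONMENT', '')
    return {
        'api': env.get('DW_API_HOST', create_url('api', dw_environment)),
        'download': env.get(
            'DW_DOWNLOAD_HOST', create_url('download', dw_environment)),
        'query': env.get(
            'DW_QUERY_HOST', create_url('query', dw_environment)),
    }


# No variables set: the public hosts are used.
public_hosts = resolve_hosts({})
# Only DW_ENVIRONMENT set: all three hosts follow it.
private_hosts = resolve_hosts({'DW_ENVIRONMENT': 'your_org'})

print(public_hosts['api'])
print(private_hosts['api'])
```

Setting one of the DW_*_HOST variables overrides only that host; the other two are still derived from DW_ENVIRONMENT.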

Examples

The load_dataset() function facilitates maintaining copies of datasets on the local filesystem. It will download a given dataset's datapackage and store it under ~/.dw/cache. When used subsequently, load_dataset() will use the copy stored on disk and will work offline, unless it's called with force_update=True or auto_update=True. force_update=True will overwrite your local copy unconditionally. auto_update=True will only overwrite your local copy if a newer version of the dataset is available on data.world.
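
The update rules described above can be summarized as a small decision function. This is only a model of the documented behavior, not the library's code; needs_download and its arguments are hypothetical names.

```python
def needs_download(cached, force_update=False, auto_update=False,
                   remote_is_newer=False):
    """Hypothetical model of when load_dataset() fetches a fresh copy."""
    if not cached:
        # Nothing under the local cache yet: a download is required.
        return True
    if force_update:
        # Overwrite the local copy unconditionally.
        return True
    if auto_update:
        # Overwrite only if data.world has a newer version.
        return remote_is_newer
    # Otherwise keep using the copy on disk (works offline).
    return False


print(needs_download(cached=True))
print(needs_download(cached=True, auto_update=True, remote_is_newer=True))
```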

Once loaded, a dataset (data and metadata) can be conveniently accessed via the object returned by load_dataset().

Start by importing the datadotworld module:

import datadotworld as dw

Then invoke the load_dataset() function to download a dataset and work with it locally. For example:

intro_dataset = dw.load_dataset('jonloyens/an-intro-to-dataworld-dataset')

Dataset objects allow access to data via three different properties: raw_data, tables and dataframes. Each of these properties is a mapping (dict) whose values are of type bytes, list and pandas.DataFrame, respectively. Values are lazy loaded and cached once loaded. Their keys are the names of the files contained in the dataset.

For example:

>>> intro_dataset.dataframes
LazyLoadedDict({
    'changelog': LazyLoadedValue(<pandas.DataFrame>),
    'datadotworldbballstats': LazyLoadedValue(<pandas.DataFrame>),
    'datadotworldbballteam': LazyLoadedValue(<pandas.DataFrame>)})

IMPORTANT: Not all files in a dataset are tabular, therefore some will be exposed via raw_data only.

Tables are lists of rows, each represented by a mapping (dict) of column names to their respective values.

For example:

>>> stats_table = intro_dataset.tables['datadotworldbballstats']
>>> stats_table[0]
OrderedDict([('Name', 'Jon'),
             ('PointsPerGame', Decimal('20.4')),
             ('AssistsPerGame', Decimal('1.3'))])
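
Because a table is a plain list of dict-like rows, ordinary Python is enough to work with it. The sketch below uses a stand-in list rather than a downloaded dataset: the first row is the one shown above, and the second row is made up purely for illustration.

```python
from collections import OrderedDict
from decimal import Decimal

# Stand-in for intro_dataset.tables['datadotworldbballstats'].
# Row 1 is taken from the example above; row 2 is invented sample data.
stats_table = [
    OrderedDict([('Name', 'Jon'),
                 ('PointsPerGame', Decimal('20.4')),
                 ('AssistsPerGame', Decimal('1.3'))]),
    OrderedDict([('Name', 'Example Player'),
                 ('PointsPerGame', Decimal('10.0')),
                 ('AssistsPerGame', Decimal('2.5'))]),
]

# Column access is by name; numeric values stay exact Decimals.
names = [row['Name'] for row in stats_table]
total_points = sum(row['PointsPerGame'] for row in stats_table)
top_scorer = max(stats_table, key=lambda row: row['PointsPerGame'])['Name']

print(names, total_points, top_scorer)
```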

You can also review the metadata associated with a file or the entire dataset, using the describe() method. For example:

>>> intro_dataset.describe()
{'homepage': 'https://data.world/jonloyens/an-intro-to-dataworld-dataset',
 'name': 'jonloyens_an-intro-to-dataworld-dataset',
 'resources': [{'format': 'csv',
   'name': 'changelog',
   'path': 'data/ChangeLog.csv'},
  {'format': 'csv',
   'name': 'datadotworldbballstats',
   'path': 'data/DataDotWorldBBallStats.csv'},
  {'format': 'csv',
   'name': 'datadotworldbballteam',
   'path': 'data/DataDotWorldBBallTeam.csv'}]}
>>> intro_dataset.describe('datadotworldbballstats')
{'format': 'csv',
 'name': 'datadotworldbballstats',
 'path': 'data/DataDotWorldBBallStats.csv',
 'schema': {'fields': [{'name': 'Name', 'title': 'Name', 'type': 'string'},
                       {'name': 'PointsPerGame',
                        'title': 'PointsPerGame',
                        'type': 'number'},
                       {'name': 'AssistsPerGame',
                        'title': 'AssistsPerGame',
                        'type': 'number'}]}}
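
The returned metadata is an ordinary dict, so the schema can be inspected programmatically. As a small sketch, the descriptor below is copied from the describe('datadotworldbballstats') output above and used to list the numeric columns; the variable names are illustrative.

```python
# Descriptor copied from the describe('datadotworldbballstats') output.
resource = {
    'format': 'csv',
    'name': 'datadotworldbballstats',
    'path': 'data/DataDotWorldBBallStats.csv',
    'schema': {'fields': [
        {'name': 'Name', 'title': 'Name', 'type': 'string'},
        {'name': 'PointsPerGame', 'title': 'PointsPerGame',
         'type': 'number'},
        {'name': 'AssistsPerGame', 'title': 'AssistsPerGame',
         'type': 'number'},
    ]},
}

# Map each column name to its declared type.
column_types = {
    field['name']: field['type'] for field in resource['schema']['fields']
}
# Keep only the columns declared as numbers.
numeric_columns = [
    name for name, kind in column_types.items() if kind == 'number'
]

print(numeric_columns)
```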

Reference

Standalone functions

load_dataset(dataset_key, force_update=False, auto_update=False)

Load a dataset from the local filesystem, downloading it from data.world first, if necessary.

This function returns an object of type LocalDataset. The object allows access to metadata via its describe() method and to all the data via three properties: raw_data, tables and dataframes, all of which are mappings (dict-like structures).

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id or of a url

  • force_update (bool) – Flag, indicating if a new copy of the dataset should be downloaded replacing any previously downloaded copy (Default value = False)

  • auto_update (bool) – Flag indicating whether the dataset should be updated to the latest version when a newer one is available on data.world (Default value = False)

Returns

The object representing the dataset

Return type

LocalDataset

Raises

RestApiError – If a server error occurs
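
The two accepted forms of dataset_key (owner/id or a URL) identify the same dataset. The sketch below illustrates that equivalence with a hypothetical helper, split_dataset_key; the library does its own parsing, and this is not that code.

```python
from urllib.parse import urlparse


def split_dataset_key(dataset_key):
    """Hypothetical helper: return (owner, id) from 'owner/id' or a URL."""
    if '://' in dataset_key:
        # URL form: keep only the path portion.
        path = urlparse(dataset_key).path
    else:
        path = dataset_key
    parts = [part for part in path.split('/') if part]
    if len(parts) != 2:
        raise ValueError(
            'Expected owner/id or a dataset URL, got: {!r}'.format(
                dataset_key))
    return parts[0], parts[1]


short_form = split_dataset_key('jonloyens/an-intro-to-dataworld-dataset')
url_form = split_dataset_key(
    'https://data.world/jonloyens/an-intro-to-dataworld-dataset')
print(short_form == url_form)
```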

open_remote_file(dataset_key, file_name, mode='w', **kwargs)

Open a remote file object that can be used to write to or read from a file in a data.world dataset

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • file_name (str) – The name of the file to open

  • mode (str, optional) – The mode for the file: must be ‘w’, ‘wb’, ‘r’, or ‘rb’, indicating read/write (‘r’/‘w’) and, optionally, “binary” handling of the file data. (Default value = ‘w’)

  • chunk_size (int, optional) – size of chunked bytes to return when reading streamed bytes in ‘rb’ mode

  • decode_unicode (bool, optional) – whether to decode textual responses as unicode when returning streamed lines in ‘r’ mode

  • **kwargs

Examples

>>> import datadotworld as dw
>>>
>>> # write a text file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.txt') as w:
...     w.write("this is a test.")
>>>
>>> # write a jsonlines file
>>> import json
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.jsonl') as w:
...     json.dump({'foo': 42, 'bar': "A"}, w)
...     w.write("\n")
...     json.dump({'foo': 13, 'bar': "B"}, w)
...     w.write("\n")
>>>
>>> # write a csv file
>>> import csv
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.csv') as w:
...     csvw = csv.DictWriter(w, fieldnames=['foo', 'bar'])
...     csvw.writeheader()
...     csvw.writerow({'foo': 42, 'bar': "A"})
...     csvw.writerow({'foo': 13, 'bar': "B"})
>>>
>>> # write a pandas dataframe as a csv file
>>> import pandas as pd
>>> df = pd.DataFrame({'foo': [1, 2, 3, 4], 'bar': ['a', 'b', 'c', 'd']})
>>> with dw.open_remote_file('username/test-dataset',
...                          'dataframe.csv') as w:
...     df.to_csv(w, index=False)
>>>
>>> # write a binary file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.txt', mode='wb') as w:
...     w.write(bytes([100, 97, 116, 97, 46, 119, 111, 114, 108, 100]))
>>>
>>> # read a text file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.txt', mode='r') as r:
...     print(r.read())
>>>
>>> # read a csv file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.csv', mode='r') as r:
...     csvr = csv.DictReader(r)
...     for row in csvr:
...         print(row['column a'], row['column b'])
>>>
>>> # read a binary file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test', mode='rb') as r:
...     bytes = r.read()

query(dataset_key, query, query_type='sql', parameters=None)

Query an existing dataset

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id or of a url

  • query (str) – SQL or SPARQL query

  • query_type ({'sql', 'sparql'}, optional) – The type of the query. Must be either ‘sql’ or ‘sparql’. (Default value = “sql”)

  • parameters (query parameters, optional) – Parameters to the query. For a SPARQL query, this should be a dict containing named parameters; for a SQL query, this should be a list containing positional parameters. Boolean values will be converted to xsd:boolean, Integer values to xsd:integer, and other Numeric values to xsd:decimal. Anything else is treated as a String literal (Default value = None)

Returns

Object containing the results of the query

Return type

QueryResults

Raises

RuntimeError – If a server error occurs
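
The parameter conversion rules listed for query() can be pictured with a small classifier. This is an illustrative sketch only: xsd_type_for is a hypothetical name, and the library's actual conversion code may differ. The two sample parameter values show the expected shapes (a list for SQL, a dict for SPARQL); their contents are made up.

```python
from numbers import Number


def xsd_type_for(value):
    """Hypothetical illustration of the documented conversion rules."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return 'xsd:boolean'
    if isinstance(value, int):
        return 'xsd:integer'
    if isinstance(value, Number):
        return 'xsd:decimal'
    # Anything else is treated as a string literal.
    return 'string literal'


# SQL queries take positional parameters as a list ...
sql_parameters = ['Jon', 20, 1.5, True]
# ... while SPARQL queries take named parameters as a dict.
sparql_parameters = {'name': 'Jon', 'min_points': 20}

sql_types = [xsd_type_for(value) for value in sql_parameters]
sparql_types = {
    key: xsd_type_for(value) for key, value in sparql_parameters.items()
}
print(sql_types)
print(sparql_types)
```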

API Client Methods

The following functions are all methods of the API client object returned by datadotworld.api_client().

add_files_via_url(dataset_key, files={})

Add or update dataset files linked to source URLs

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • files (dict) – Dict keyed by file name, where each value is a dict containing the source URL plus an optional description and optional labels for the file to add or update (Default value = {})

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> url = 'http://www.acme.inc/example.csv'
>>> api_client = dw.api_client()
>>> api_client.add_files_via_url(
...     'username/test-dataset',
...     {'example.csv': {
...         'url': url,
...         'labels': ['raw data'],
...         'description': 'file description'}})

add_linked_dataset(project_key, dataset_key)

Link a project to an existing dataset

This method links a dataset to a project

Parameters
  • project_key (str) – Project identifier, in the form of owner/id

  • dataset_key (str) – Dataset identifier, in the form of owner/id

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> linked_dataset = api_client.add_linked_dataset(
...     'username/test-project',
...     'username/test-dataset')

append_records(dataset_key, stream_id, body)

Append records to a stream.

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • stream_id (str) – Stream unique identifier.

  • body (obj) – Object body

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.append_records('username/test-dataset', 'streamId',
...                           {'content': 'content'})

create_dataset(owner_id, **kwargs)

Create a new dataset

Parameters
  • owner_id (str) – Username of the owner of the new dataset

  • title (str) – Dataset title (will be used to generate dataset id on creation)

  • description (str, optional) – Dataset description

  • summary (str, optional) – Dataset summary markdown

  • tags (list, optional) – Dataset tags

  • license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license

  • visibility ({'OPEN', 'PRIVATE'}) – Dataset visibility

  • files (dict, optional) – Dict keyed by file name, with the source URL, description and labels as properties of each entry. Description and labels are optional.

Returns

Newly created dataset key

Return type

str

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> url = 'http://www.acme.inc/example.csv'
>>> api_client.create_dataset(
...     'username', title='Test dataset', visibility='PRIVATE',
...     license='Public Domain',
...     files={'dataset.csv': {'url': url}})

create_insight(project_key, **kwargs)

Create a new insight

Parameters
  • project_key (str) – Project identifier, in the form of projectOwner/projectid

  • title (str) – Insight title

  • description (str, optional) – Insight description.

  • image_url (str) – If image-based, the URL of the image

  • embed_url (str) – If embed-based, the embeddable URL

  • source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.

  • data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.

Returns

Insight with message and uri object

Return type

object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.create_insight(
...     'projectOwner/projectid', title='Test insight',
...     image_url='url')

create_project(owner_id, **kwargs)

Create a new project

Parameters
  • owner_id (str) – Username of the creator of a project.

  • title (str) – Project title (will be used to generate project id on creation)

  • objective (str, optional) – Short project objective.

  • summary (str, optional) – Long-form project summary.

  • tags (list, optional) – Project tags. Letters, numbers and spaces.

  • license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license

  • visibility ({'OPEN', 'PRIVATE'}) – Project visibility

  • files (dict, optional) – Dict keyed by file name, with the source URL, description and labels as properties of each entry. Description and labels are optional.

  • linked_datasets (list of object, optional) – Initial set of linked datasets.

Returns

Newly created project key

Return type

str

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.create_project(
...     'username', title='project testing',
...     visibility='PRIVATE',
...     linked_datasets=[{'owner': 'someuser',
...                       'id': 'somedataset'}])

delete_dataset(dataset_key)

Deletes a dataset and all associated data

Parameters

dataset_key (str) – Dataset identifier, in the form of owner/id

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.delete_dataset(
...     'username/dataset')

delete_files(dataset_key, names)

Delete dataset file(s)

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • names (list of str) – The list of names for files to be deleted

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.delete_files(
...     'username/test-dataset', ['example.csv'])

delete_insight(project_key, insight_id)

Delete an existing insight.

Parameters
  • project_key (str) – Project identifier, in the form of projectOwner/projectId

  • insight_id (str) – Insight unique id

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> del_insight = api_client.delete_insight(
...     'username/project', 'insightid')

delete_project(project_key)

Deletes a project and all associated data

Parameters

project_key (str) – Project identifier, in the form of owner/id

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.delete_project(
...     'username/test-project')

download_datapackage(dataset_key, dest_dir)

Download and unzip a dataset’s datapackage

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • dest_dir (str or path) – Directory under which datapackage should be saved

Returns

Location of the datapackage descriptor (datapackage.json) in the local filesystem

Return type

path

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> datapackage_descriptor = api_client.download_datapackage(
...     'jonloyens/an-intro-to-dataworld-dataset',
...     '/tmp/test')
>>> datapackage_descriptor
'/tmp/test/datapackage.json'

download_dataset(dataset_key)

Return a .zip containing all files within the dataset as uploaded.

Parameters

dataset_key (str) – Dataset identifier, in the form of owner/id

Returns

.zip file containing the files within the dataset

Return type

file object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.download_dataset(
...     'username/test-dataset')

download_file(dataset_key, file)

Return a file within the dataset as uploaded.

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • file (str) – File path to be returned

Returns

file in which the data was uploaded

Return type

file object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.download_file('username/test-dataset',
...                          '/my/local/example.csv')

fetch_contributing_datasets(**kwargs)

Fetch datasets that the authenticated user has access to

Parameters
  • limit (str, optional) – Maximum number of items to include in a page of results

  • next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)

  • sort (str, optional) – Property name to sort by

Returns

Authenticated user dataset

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_dataset = api_client.fetch_contributing_datasets()
>>> user_dataset
{'count': 0, 'records': [], 'next_page_token': None}

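
Results are paged: the limit and next parameters control paging, and the response carries records and next_page_token. One way to walk through all pages is sketched below. It assumes that the next_page_token value of one response is what should be passed as next in the following request; iter_records and fake_fetch are hypothetical names, and fake_fetch stands in for a client method such as api_client.fetch_contributing_datasets with made-up records.

```python
def iter_records(fetch_page, **kwargs):
    """Yield records from every page returned by fetch_page.

    fetch_page is any callable accepting limit/next/sort keyword
    arguments and returning a dict with 'records' and
    'next_page_token' keys.
    """
    while True:
        page = fetch_page(**kwargs)
        for record in page.get('records', []):
            yield record
        token = page.get('next_page_token')
        if not token:
            # No further pages.
            break
        # Assumption: the token is passed back as the 'next' parameter.
        kwargs['next'] = token


# Stand-in data: two pages keyed by the token that requests them.
_PAGES = {
    None: {'count': 3, 'records': ['a', 'b'], 'next_page_token': 't1'},
    't1': {'count': 3, 'records': ['c'], 'next_page_token': None},
}


def fake_fetch(**kwargs):
    # Mimics a paged client method using the canned pages above.
    return _PAGES[kwargs.get('next')]


all_records = list(iter_records(fake_fetch, limit='2'))
print(all_records)
```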
fetch_contributing_projects(**kwargs)

Fetch projects that the currently authenticated user has access to

Returns

Authenticated user projects

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_projects = api_client.fetch_contributing_projects()
>>> user_projects
{'count': 0, 'records': [], 'next_page_token': None}

fetch_datasets(**kwargs)

Fetch authenticated user owned datasets

Parameters
  • limit (str, optional) – Maximum number of items to include in a page of results

  • next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)

  • sort (str, optional) – Property name to sort by

Returns

Dataset definition, with all attributes

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_owned_dataset = api_client.fetch_datasets()

fetch_liked_datasets(**kwargs)

Fetch datasets that authenticated user likes

Parameters
  • limit (str, optional) – Maximum number of items to include in a page of results

  • next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)

  • sort (str, optional) – Property name to sort by

Returns

Dataset definition, with all attributes

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_liked_dataset = api_client.fetch_liked_datasets()

fetch_liked_projects(**kwargs)

Fetch projects that the currently authenticated user likes

Returns

Authenticated user projects

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_liked_projects = api_client.fetch_liked_projects()

fetch_projects(**kwargs)

Fetch projects that the currently authenticated user owns

Returns

Authenticated user projects

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_projects = api_client.fetch_projects()

get_dataset(dataset_key)

Retrieve an existing dataset definition

This method retrieves metadata about an existing dataset

Parameters

dataset_key (str) – Dataset identifier, in the form of owner/id

Returns

Dataset definition, with all attributes

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> intro_dataset = api_client.get_dataset(
...     'jonloyens/an-intro-to-dataworld-dataset')
>>> intro_dataset['title']
'An Intro to data.world Dataset'

get_insight(project_key, insight_id, **kwargs)

Retrieve an insight

Parameters
  • project_key (str) – Project identifier, in the form of projectOwner/projectid

  • insight_id (str) – Insight unique identifier.

Returns

Insight definition, with all attributes

Return type

object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> insight = api_client.get_insight(
...     'jonloyens/'
...     'an-example-project-that-shows-what-to-put-in-data-world',
...     'c2538b0c-c200-474c-9631-5ff4f13026eb')
>>> insight['title']
'Coast Guard Lives Saved by Fiscal Year'

get_insights_for_project(project_key, **kwargs)

Get insights for a project.

Parameters

project_key (str) – Project identifier, in the form of projectOwner/projectid

Returns

Insight results

Return type

object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> insights = api_client.get_insights_for_project(
...     'jonloyens/'
...     'an-example-project-that-shows-what-to-put-in-data-world'
... )

get_project(project_key)

Retrieve an existing project

This method retrieves metadata about an existing project

Parameters

project_key (str) – Project identifier, in the form of owner/id

Returns

Project definition, with all attributes

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> intro_project = api_client.get_project(
...     'jonloyens/'
...     'an-example-project-that-shows-what-to-put-in-data-world'
... )
>>> intro_project['title']
'An Example Project that Shows What To Put in data.world'

get_user_data()

Retrieve data for authenticated user

Returns

User data, with all attributes

Return type

dict

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_data = api_client.get_user_data()
>>> user_data['display_name']
'Name User'

remove_linked_dataset(project_key, dataset_key)

Unlink dataset

This method unlinks a dataset from a project

Parameters
  • project_key (str) – Project identifier, in the form of owner/id

  • dataset_key (str) – Dataset identifier, in the form of owner/id

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.remove_linked_dataset(
...     'username/test-project',
...     'username/test-dataset')

replace_dataset(dataset_key, **kwargs)

Replace an existing dataset

This method will completely overwrite an existing dataset.

Parameters
  • description (str, optional) – Dataset description

  • summary (str, optional) – Dataset summary markdown

  • tags (list, optional) – Dataset tags

  • license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license

  • visibility ({'OPEN', 'PRIVATE'}) – Dataset visibility

  • files (dict, optional) – File names and source URLs to add or update

  • dataset_key (str) – Dataset identifier, in the form of owner/id

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.replace_dataset(
...     'username/test-dataset',
...     visibility='PRIVATE', license='Public Domain',
...     description='A better description')

replace_insight(project_key, insight_id, **kwargs)

Replace an insight.

Parameters
  • project_key (str) – Project identifier, in the form of projectOwner/projectid

  • insight_id (str) – Insight unique identifier.

  • title (str) – Insight title

  • description (str, optional) – Insight description.

  • image_url (str) – If image-based, the URL of the image

  • embed_url (str) – If embed-based, the embeddable URL

  • source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.

  • data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.

Returns

message object

Return type

object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.replace_insight(
...     'projectOwner/projectid',
...     '1230-9324-3424242442',
...     embed_url='url',
...     title='Test insight')

replace_project(project_key, **kwargs)

Replace an existing Project

Create a project with a given id or completely rewrite the project, including any previously added files or linked datasets, if one already exists with the given id.

Parameters
  • project_key (str) – Username and unique identifier of the creator of a project in the form of owner/id.

  • title (str) – Project title

  • objective (str, optional) – Short project objective.

  • summary (str, optional) – Long-form project summary.

  • tags (list, optional) – Project tags. Letters, numbers and spaces.

  • license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license

  • visibility ({'OPEN', 'PRIVATE'}) – Project visibility

  • files (dict, optional) – Dict keyed by file name, with the source URL, description and labels as properties of each entry. Description and labels are optional.

  • linked_datasets (list of object, optional) – Initial set of linked datasets.

Returns

project object

Return type

object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.replace_project(
...     'username/test-project',
...     visibility='PRIVATE',
...     objective='A better objective',
...     title='Replace project')

sparql(dataset_key, query, desired_mimetype='application/sparql-results+json', **kwargs)

Executes SPARQL queries against a dataset via POST

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • query (str) – SPARQL query

Returns

file object that can be used in file parsers and data handling modules.

Return type

file object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.sparql('username/test-dataset',
...                   query)

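
With the default desired_mimetype of application/sparql-results+json, the file object returned by the sparql method should contain results in the W3C SPARQL JSON results format. The sketch below shows one way to turn such a file into a list of dicts. It reads from an in-memory stand-in with a made-up payload rather than a real response; bindings_to_rows is a hypothetical helper, and whether the real file object yields text or bytes is not stated here (json.load accepts either).

```python
import io
import json


def bindings_to_rows(results_file):
    """Convert SPARQL JSON results from a file-like object into dicts."""
    payload = json.load(results_file)
    variables = payload['head']['vars']
    rows = []
    for binding in payload['results']['bindings']:
        # A variable may be unbound in a given solution, hence .get().
        rows.append(
            {var: binding.get(var, {}).get('value') for var in variables})
    return rows


# Made-up payload in the SPARQL JSON results layout.
sample = io.StringIO(json.dumps({
    'head': {'vars': ['name', 'points']},
    'results': {'bindings': [
        {'name': {'type': 'literal', 'value': 'Jon'},
         'points': {'type': 'literal', 'value': '20.4'}},
        {'name': {'type': 'literal', 'value': 'Example Player'}},
    ]},
}))

rows = bindings_to_rows(sample)
print(rows)
```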
sql(dataset_key, query, desired_mimetype='application/json', **kwargs)

Executes SQL queries against a dataset via POST

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • query (str) – SQL query

  • include_table_schema (bool) – Flag indicating whether to include the table schema in the response

Returns

file object that can be used in file parsers and data handling modules.

Return type

file-like object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.sql('username/test-dataset', 'query')

sync_files(dataset_key)

Trigger synchronization process to update all dataset files linked to source URLs.

Parameters

dataset_key (str) – Dataset identifier, in the form of owner/id

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.sync_files('username/test-dataset')

update_dataset(dataset_key, **kwargs)

Update an existing dataset

Parameters
  • description (str, optional) – Dataset description

  • summary (str, optional) – Dataset summary markdown

  • tags (list, optional) – Dataset tags

  • license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license

  • visibility ({'OPEN', 'PRIVATE'}, optional) – Dataset visibility

  • files (dict, optional) – File names and source URLs to add or update

  • dataset_key (str) – Dataset identifier, in the form of owner/id

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.update_dataset(
...     'username/test-dataset',
...     tags=['demo', 'datadotworld'])

update_insight(project_key, insight_id, **kwargs)

Update an insight.

Note that only elements included in the request will be updated. All omitted elements will remain untouched.

Parameters
  • project_key (str) – Project identifier, in the form of projectOwner/projectid

  • insight_id (str) – Insight unique identifier.

  • title (str) – Insight title

  • description (str, optional) – Insight description.

  • image_url (str) – If image-based, the URL of the image

  • embed_url (str) – If embed-based, the embeddable URL

  • source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.

  • data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.

Returns

message object

Return type

object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.update_insight(
...     'username/test-project', 'insightid',
...     title='demo atadotworld')

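
The difference between the update_* and replace_* methods can be pictured with plain dicts: an update changes only the elements that are passed and leaves the rest untouched, while a replace discards the previous definition. The functions below are a hypothetical model of that contract, not the server's implementation, and the insight values are made up.

```python
def apply_update(existing, **changes):
    """Model of update semantics: omitted elements remain untouched."""
    updated = dict(existing)
    updated.update(changes)
    return updated


def apply_replace(existing, **new_definition):
    """Model of replace semantics: the old definition is discarded."""
    return dict(new_definition)


insight = {'title': 'Test insight', 'description': 'First draft',
           'image_url': 'url'}

updated = apply_update(insight, title='Renamed insight')
replaced = apply_replace(insight, title='Renamed insight', embed_url='url')

print(updated)
print(replaced)
```

In this model the updated insight keeps its description and image_url, whereas the replaced one contains only what was passed in.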
update_project(project_key, **kwargs)

Update an existing project

Parameters
  • project_key (str) – Username and unique identifier of the creator of a project in the form of owner/id.

  • title (str) – Project title

  • objective (str, optional) – Short project objective.

  • summary (str, optional) – Long-form project summary.

  • tags (list, optional) – Project tags. Letters, numbers and spaces.

  • license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license

  • visibility ({'OPEN', 'PRIVATE'}) – Project visibility

  • files (dict, optional) – Dict keyed by file name, with the source URL, description and labels as properties of each entry. Description and labels are optional.

  • linked_datasets (list of object, optional) – Initial set of linked datasets.

Returns

message object

Return type

object

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.update_project(
...     'username/test-project',
...     tags=['demo', 'datadotworld'])

upload_files(dataset_key, files, files_metadata={}, **kwargs)

Upload one or more dataset files

Parameters
  • dataset_key (str) – Dataset identifier, in the form of owner/id

  • files (list of str) – The list of names/paths for files stored in the local filesystem

  • expand_archives (bool, optional) – Boolean value to indicate files should be expanded upon upload

  • files_metadata (dict, optional) – Dict keyed by file name, where each value is a dict containing the description, labels and source URL of the file to add or update (Default value = {})

Raises

RestApiException – If a server error occurs

Examples

>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.upload_files(
...     'username/test-dataset',
...     ['/my/local/example.csv'])