Python
Data.world has developed an open source Python library for working with data.world datasets.
This library makes it easy for data.world users to pull and work with data stored on data.world. Additionally, the library provides convenient wrappers for data.world APIs, allowing users to create and update datasets, add and modify files, and possibly implement entire apps on top of data.world.
Installation
You can install the data.world Python library using pip directly from PyPI:
pip install datadotworld
Optionally, you can install the library including pandas support:
pip install datadotworld[pandas]
If you use conda to manage your Python distribution, you can install from the community-maintained [conda-forge](https://conda-forge.github.io/) channel:
conda install -c conda-forge datadotworld-py
Configuration
This library requires a data.world API authentication token to work.
Your authentication token can be obtained on data.world once you enable Python under Integrations > Python.
To configure the library, run the following command:
dw configure
Alternatively, tokens can be provided via the DW_AUTH_TOKEN environment variable. On macOS or Unix machines, run (replacing <YOUR_TOKEN> below with the token obtained earlier):
export DW_AUTH_TOKEN=<YOUR_TOKEN>
Define content for SDK host
If you want to connect to the public data.world API server, the default settings apply. However, if you want to connect to a private data.world environment (e.g., your_org.data.world), you can use environment variables to define the host:
from os import environ


def create_url(subdomain, environment):
    # Append the environment (e.g. 'your_org') to the subdomain when set
    if environment:
        subdomain = subdomain + '.' + environment
    return 'https://{}.data.world'.format(subdomain)


DW_ENVIRONMENT = environ.get('DW_ENVIRONMENT', '')
API_HOST = environ.get('DW_API_HOST', create_url('api', DW_ENVIRONMENT))
DOWNLOAD_HOST = environ.get(
    'DW_DOWNLOAD_HOST', create_url('download', DW_ENVIRONMENT))
QUERY_HOST = environ.get('DW_QUERY_HOST', create_url('query', DW_ENVIRONMENT))
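To see which hosts a given set of environment variables would produce, the same logic can be run against a plain dict instead of the real environment. This is a minimal sketch: resolve_hosts is a hypothetical helper, not part of the library, and the https://api.example.test override is an invented value for illustration.

```python
def create_url(subdomain, environment):
    """Build a data.world host URL, mirroring the snippet above."""
    if environment:
        subdomain = subdomain + '.' + environment
    return 'https://{}.data.world'.format(subdomain)


def resolve_hosts(env):
    """Hypothetical helper: compute the three hosts from a mapping of
    environment variables (for example, os.environ)."""
    dw_environment = env.get('DW_ENVIRONMENT', '')
    return {
        'api': env.get('DW_API_HOST', create_url('api', dw_environment)),
        'download': env.get('DW_DOWNLOAD_HOST',
                            create_url('download', dw_environment)),
        'query': env.get('DW_QUERY_HOST', create_url('query', dw_environment)),
    }


default_hosts = resolve_hosts({})
private_hosts = resolve_hosts({'DW_ENVIRONMENT': 'your_org'})
# An explicit host variable takes precedence over DW_ENVIRONMENT (invented URL).
override_hosts = resolve_hosts({'DW_API_HOST': 'https://api.example.test'})
```

Setting only DW_ENVIRONMENT changes all three hosts together, while the individual DW_*_HOST variables override one host at a time.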
Examples
The load_dataset() function facilitates maintaining copies of datasets on the local filesystem. It will download a given dataset's datapackage and store it under ~/.dw/cache. When used subsequently, load_dataset() will use the copy stored on disk and will work offline, unless it's called with force_update=True or auto_update=True. force_update=True will overwrite your local copy unconditionally. auto_update=True will only overwrite your local copy if a newer version of the dataset is available on data.world.
Once loaded, a dataset (data and metadata) can be conveniently accessed via the object returned by load_dataset().
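The caching rules above can be summarized as a small decision function. This is only a sketch of the documented behavior, not the library's implementation; needs_download and its arguments are hypothetical names.

```python
def needs_download(has_local_copy, remote_is_newer,
                   force_update=False, auto_update=False):
    """Hypothetical decision helper; not part of datadotworld."""
    if not has_local_copy:
        return True              # nothing cached yet, must download
    if force_update:
        return True              # overwrite unconditionally
    if auto_update:
        return remote_is_newer   # overwrite only if a newer version exists
    return False                 # reuse the cached copy, works offline


decisions = {
    'first_use': needs_download(False, False),
    'cached_default': needs_download(True, True),
    'forced': needs_download(True, False, force_update=True),
    'auto_stale': needs_download(True, True, auto_update=True),
    'auto_fresh': needs_download(True, False, auto_update=True),
}
```

Note the 'cached_default' case: without either flag, the local copy is reused even when a newer version exists remotely.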
Start by importing the datadotworld module:
import datadotworld as dw
Then, invoke the load_dataset() function to download a dataset and work with it locally. For example:
intro_dataset = dw.load_dataset('jonloyens/an-intro-to-dataworld-dataset')
Dataset objects allow access to data via three different properties: raw_data, tables and dataframes. Each of these properties is a mapping (dict) whose values are of type bytes, list and pandas.DataFrame, respectively. Values are lazy loaded and cached once loaded. Their keys are the names of the files contained in the dataset.
For example:
>>> intro_dataset.dataframes
LazyLoadedDict({
    'changelog': LazyLoadedValue(<pandas.DataFrame>),
    'datadotworldbballstats': LazyLoadedValue(<pandas.DataFrame>),
    'datadotworldbballteam': LazyLoadedValue(<pandas.DataFrame>)})
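To illustrate what "lazy loaded and cached" means for these mappings, here is a minimal stand-in built on the standard library. LazyMapping is a hypothetical class written for this illustration; it is not the library's LazyLoadedDict, whose internals may differ.

```python
from collections.abc import Mapping


class LazyMapping(Mapping):
    """Hypothetical stand-in: values are computed on first access, then cached."""

    def __init__(self, loaders):
        self._loaders = dict(loaders)   # key -> zero-argument callable
        self._cache = {}

    def __getitem__(self, key):
        if key not in self._cache:
            self._cache[key] = self._loaders[key]()   # load once
        return self._cache[key]

    def __iter__(self):
        return iter(self._loaders)

    def __len__(self):
        return len(self._loaders)


load_calls = []


def load_changelog():
    load_calls.append('changelog')      # record each real load
    return b'raw bytes of the file'


raw_data = LazyMapping({'changelog': load_changelog})
keys_before_load = list(raw_data)       # listing keys triggers no load
calls_after_listing = len(load_calls)
first = raw_data['changelog']           # first access runs the loader
second = raw_data['changelog']          # second access is served from the cache
```

The practical consequence is that listing keys is cheap, and the cost of parsing a file is paid only for the entries actually accessed.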
IMPORTANT: Not all files in a dataset are tabular; therefore, some will be exposed via raw_data only.
Tables are lists of rows, each represented by a mapping (dict) of column names to their respective values.
For example:
>>> stats_table = intro_dataset.tables['datadotworldbballstats']
>>> stats_table[0]
OrderedDict([('Name', 'Jon'),
             ('PointsPerGame', Decimal('20.4')),
             ('AssistsPerGame', Decimal('1.3'))])
You can also review the metadata associated with a file or the entire dataset, using the describe() function. For example:
>>> intro_dataset.describe()
{'homepage': 'https://data.world/jonloyens/an-intro-to-dataworld-dataset',
 'name': 'jonloyens_an-intro-to-dataworld-dataset',
 'resources': [{'format': 'csv',
                'name': 'changelog',
                'path': 'data/ChangeLog.csv'},
               {'format': 'csv',
                'name': 'datadotworldbballstats',
                'path': 'data/DataDotWorldBBallStats.csv'},
               {'format': 'csv',
                'name': 'datadotworldbballteam',
                'path': 'data/DataDotWorldBBallTeam.csv'}]}
>>> intro_dataset.describe('datadotworldbballstats')
{'format': 'csv',
 'name': 'datadotworldbballstats',
 'path': 'data/DataDotWorldBBallStats.csv',
 'schema': {'fields': [{'name': 'Name', 'title': 'Name', 'type': 'string'},
                       {'name': 'PointsPerGame',
                        'title': 'PointsPerGame',
                        'type': 'number'},
                       {'name': 'AssistsPerGame',
                        'title': 'AssistsPerGame',
                        'type': 'number'}]}}
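Because describe() returns plain dicts, the schema can be inspected with ordinary Python. The sketch below works on a literal copy of the file metadata shown above, so it runs without a data.world connection; the variable names are illustrative.

```python
# Copied from the describe('datadotworldbballstats') output shown above.
file_metadata = {
    'format': 'csv',
    'name': 'datadotworldbballstats',
    'path': 'data/DataDotWorldBBallStats.csv',
    'schema': {'fields': [
        {'name': 'Name', 'title': 'Name', 'type': 'string'},
        {'name': 'PointsPerGame', 'title': 'PointsPerGame', 'type': 'number'},
        {'name': 'AssistsPerGame', 'title': 'AssistsPerGame', 'type': 'number'},
    ]},
}

# Map each column name to its declared type.
column_types = {f['name']: f['type'] for f in file_metadata['schema']['fields']}

# Keep only the columns declared as numeric, in schema order.
numeric_columns = [n for n, t in column_types.items() if t == 'number']
```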
Reference
Standalone functions
load_dataset(dataset_key, force_update=False, auto_update=False)
Load a dataset from the local filesystem, downloading it from data.world first, if necessary.
This function returns an object of type LocalDataset. The object allows access to metadata via its describe() method and to all the data via three properties, raw_data, tables and dataframes, all of which are mappings (dict-like structures).
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id or of a URL
force_update (bool) – Flag indicating whether a new copy of the dataset should be downloaded, replacing any previously downloaded copy (Default value = False)
auto_update (bool) – Flag indicating whether the dataset should be updated to the latest version, if a newer one is available (Default value = False)
- Returns
The object representing the dataset
- Return type
LocalDataset
- Raises
RestApiError – If a server error occurs
open_remote_file(dataset_key, file_name, mode='w', **kwargs)
Open a remote file object that can be used to write to or read from a file in a data.world dataset
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
file_name (str) – The name of the file to open
mode (str, optional) – The mode for the file. Must be ‘w’, ‘wb’, ‘r’, or ‘rb’, indicating read/write (‘r’/‘w’) and optionally “binary” handling of the file data. (Default value = ‘w’)
chunk_size (int, optional) – size of chunked bytes to return when reading streamed bytes in ‘rb’ mode
decode_unicode (bool, optional) – whether to decode textual responses as unicode when returning streamed lines in ‘r’ mode
**kwargs –
Examples
>>> import datadotworld as dw
>>>
>>> # write a text file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.txt') as w:
...     w.write("this is a test.")
>>>
>>> # write a jsonlines file
>>> import json
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.jsonl') as w:
...     json.dump({'foo': 42, 'bar': "A"}, w)
...     w.write("\n")
...     json.dump({'foo': 13, 'bar': "B"}, w)
...     w.write("\n")
>>>
>>> # write a csv file
>>> import csv
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.csv') as w:
...     csvw = csv.DictWriter(w, fieldnames=['foo', 'bar'])
...     csvw.writeheader()
...     csvw.writerow({'foo': 42, 'bar': "A"})
...     csvw.writerow({'foo': 13, 'bar': "B"})
>>>
>>> # write a pandas dataframe as a csv file
>>> import pandas as pd
>>> df = pd.DataFrame({'foo': [1, 2, 3, 4], 'bar': ['a', 'b', 'c', 'd']})
>>> with dw.open_remote_file('username/test-dataset',
...                          'dataframe.csv') as w:
...     df.to_csv(w, index=False)
>>>
>>> # write a binary file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.txt', mode='wb') as w:
...     w.write(bytes([100, 97, 116, 97, 46, 119, 111, 114, 108, 100]))
>>>
>>> # read a text file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.txt', mode='r') as r:
...     print(r.read())
>>>
>>> # read a csv file (columns as written in the csv example above)
>>> with dw.open_remote_file('username/test-dataset',
...                          'test.csv', mode='r') as r:
...     csvr = csv.DictReader(r)
...     for row in csvr:
...         print(row['foo'], row['bar'])
>>>
>>> # read a binary file
>>> with dw.open_remote_file('username/test-dataset',
...                          'test', mode='rb') as r:
...     data = r.read()
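The csv examples above rely only on the returned object behaving like a text file, so the same write and read pattern can be tried offline. In this sketch io.StringIO stands in for the object returned by open_remote_file; that substitution is an assumption made for illustration and nothing is sent to data.world.

```python
import csv
import io

# io.StringIO stands in for the object returned by dw.open_remote_file(...).
buffer = io.StringIO()
csvw = csv.DictWriter(buffer, fieldnames=['foo', 'bar'])
csvw.writeheader()
csvw.writerow({'foo': 42, 'bar': 'A'})
csvw.writerow({'foo': 13, 'bar': 'B'})

# Rewind and read the rows back, as the 'r' mode example does.
buffer.seek(0)
rows = [dict(row) for row in csv.DictReader(buffer)]
```

Values come back as strings, because csv does not preserve Python types on a round trip.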
query(dataset_key, query, query_type='sql', parameters=None)
Query an existing dataset
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id or of a url
query (str) – SQL or SPARQL query
query_type ({'sql', 'sparql'}, optional) – The type of the query. Must be either ‘sql’ or ‘sparql’. (Default value = “sql”)
parameters (query parameters, optional) – Parameters to the query. For a SPARQL query, this should be a dict containing named parameters; for a SQL query, this should be a list containing positional parameters. Boolean values will be converted to xsd:boolean, Integer values to xsd:integer, and other Numeric values to xsd:decimal. Anything else is treated as a String literal (Default value = None)
- Returns
Object containing the results of the query
- Return type
QueryResults
- Raises
RuntimeError – If a server error occurs
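The conversion rules described for the parameters argument can be sketched as a small classifier. xsd_type is a hypothetical helper written for this illustration and is not how the library performs the conversion; the parameter values and the SPARQL key names are also made up.

```python
from decimal import Decimal
from numbers import Number


def xsd_type(value):
    """Hypothetical helper: name the type a parameter value would map to,
    following the conversion rules stated for `parameters`."""
    if isinstance(value, bool):      # check bool first: bool is a subclass of int
        return 'xsd:boolean'
    if isinstance(value, int):
        return 'xsd:integer'
    if isinstance(value, Number):    # float, Decimal, Fraction, ...
        return 'xsd:decimal'
    return 'string literal'


# Positional parameters, as a SQL query would take them.
sql_parameters = [True, 42, Decimal('20.4'), 'Jon']
sql_types = [xsd_type(v) for v in sql_parameters]

# Named parameters, as a SPARQL query would take them (key names illustrative).
sparql_parameters = {'name': 'Jon', 'points': 20.4}
sparql_types = {k: xsd_type(v) for k, v in sparql_parameters.items()}
```

The order of the checks matters: testing for int before bool would classify True as an integer.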
API Client Methods
The following functions are all methods of the client object returned by datadotworld.api_client().
add_files_via_url(dataset_key, files={})
Add or update dataset files linked to source URLs
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
files (dict) – Dict mapping each file name to a dict of file metadata: the source URL, plus an optional description and labels, to add or update (Default value = {})
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> url = 'http://www.acme.inc/example.csv'
>>> api_client = dw.api_client()
>>> api_client.add_files_via_url(
...     'username/test-dataset',
...     {'example.csv': {
...         'url': url,
...         'labels': ['raw data'],
...         'description': 'file description'}})
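When several files are added at once, it can help to build the files dict programmatically. The sketch below assumes the shape used in the example above ('url', 'labels', 'description' keys); file_entry is a hypothetical helper and the minimal.csv URL is invented for illustration.

```python
def file_entry(url, description=None, labels=None):
    """Hypothetical helper: build the metadata dict for one file.
    Only 'url' is always set; description and labels are optional."""
    entry = {'url': url}
    if description is not None:
        entry['description'] = description
    if labels is not None:
        entry['labels'] = list(labels)
    return entry


files = {
    'example.csv': file_entry('http://www.acme.inc/example.csv',
                              description='file description',
                              labels=['raw data']),
    # Invented second file, showing an entry without optional metadata.
    'minimal.csv': file_entry('http://www.acme.inc/minimal.csv'),
}
```

The resulting dict could then be passed as the second argument to add_files_via_url.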
add_linked_dataset(project_key, dataset_key)
Link project to an existing dataset
This method links a dataset to a project
- Parameters
project_key (str) – Project identifier, in the form of owner/id
dataset_key – Dataset identifier, in the form of owner/id
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> linked_dataset = api_client.add_linked_dataset(
...     'username/test-project',
...     'username/test-dataset')
append_records(dataset_key, stream_id, body)
Append records to a stream.
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
stream_id (str) – Stream unique identifier.
body (obj) – Object body
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.append_records('username/test-dataset', 'streamId',
...                           {'content': 'content'})
create_dataset(owner_id, **kwargs)
Create a new dataset
- Parameters
owner_id (str) – Username of the owner of the new dataset
title (str) – Dataset title (will be used to generate dataset id on creation)
description (str, optional) – Dataset description
summary (str, optional) – Dataset summary markdown
tags (list, optional) – Dataset tags
license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license
visibility ({'OPEN', 'PRIVATE'}) – Dataset visibility
files (dict, optional) – Dict mapping each file name to a dict with the source URL, description and labels as properties. Description and labels are optional
- Returns
Newly created dataset key
- Return type
str
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> url = 'http://www.acme.inc/example.csv'
>>> api_client.create_dataset(
...     'username', title='Test dataset', visibility='PRIVATE',
...     license='Public Domain',
...     files={'dataset.csv': {'url': url}})
create_insight(project_key, **kwargs)
Create a new insight
- Parameters
project_key (str) – Project identifier, in the form of
title (str) – Insight title
description (str, optional) – Insight description.
image_url (str) – If image-based, the URL of the image
embed_url (str) – If embed-based, the embeddable URL
source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.
data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.
- Returns
Insight with message and uri object
- Return type
object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.create_insight(
...     'projectOwner/projectid', title='Test insight',
...     image_url='url')
create_project(owner_id, **kwargs)
Create a new project
- Parameters
owner_id (str) – Username of the creator of a project.
title (str) – Project title (will be used to generate project id on creation)
objective (str, optional) – Short project objective.
summary (str, optional) – Long-form project summary.
tags (list, optional) – Project tags. Letters, numbers and spaces
license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license
visibility ({'OPEN', 'PRIVATE'}) – Project visibility
files (dict, optional) – Dict mapping each file name to a dict with the source URL, description and labels as properties. Description and labels are optional
linked_datasets (list of object, optional) – Initial set of linked datasets.
- Returns
Newly created project key
- Return type
str
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.create_project(
...     'username', title='project testing',
...     visibility='PRIVATE',
...     linked_datasets=[{'owner': 'someuser',
...                       'id': 'somedataset'}])
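The linked_datasets entries in the example above use separate 'owner' and 'id' fields, whereas most other methods take a single owner/id key. A small conversion helper bridges the two. to_linked_dataset is a hypothetical name, and the sketch assumes keys contain exactly one slash.

```python
def to_linked_dataset(dataset_key):
    """Hypothetical helper: turn 'owner/id' into {'owner': ..., 'id': ...}."""
    owner, sep, dataset_id = dataset_key.partition('/')
    if not (owner and sep and dataset_id) or '/' in dataset_id:
        raise ValueError(
            'expected a key in the form owner/id: %r' % (dataset_key,))
    return {'owner': owner, 'id': dataset_id}


linked_datasets = [to_linked_dataset('someuser/somedataset')]

# Malformed keys are rejected rather than passed on silently.
try:
    to_linked_dataset('not-a-key')
    rejected = False
except ValueError:
    rejected = True
```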
delete_dataset(dataset_key)
Deletes a dataset and all associated data
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.delete_dataset(
...     'username/dataset')
delete_files(dataset_key, names)
Delete dataset file(s)
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
names (list of str) – The list of names for files to be deleted
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.delete_files(
...     'username/test-dataset', ['example.csv'])
delete_insight(project_key, insight_id)
Delete an existing insight.
- Parameters
project_key (str) – Project identifier, in the form of projectOwner/projectId
insight_id (str) – Insight unique id
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> del_insight = api_client.delete_insight(
...     'username/project', 'insightid')
delete_project(project_key)
Deletes a project and all associated data
- Parameters
project_key (str) – Project identifier, in the form of owner/id
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.delete_project(
...     'username/test-project')
download_datapackage(dataset_key, dest_dir)
Download and unzip a dataset’s datapackage
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
dest_dir (str or path) – Directory under which datapackage should be saved
- Returns
Location of the datapackage descriptor (datapackage.json) in the local filesystem
- Return type
path
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> datapackage_descriptor = api_client.download_datapackage(
...     'jonloyens/an-intro-to-dataworld-dataset',
...     '/tmp/test')
>>> datapackage_descriptor
'/tmp/test/datapackage.json'
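Once the descriptor path is known, the datapackage can be inspected with the standard json module. The sketch below writes a stand-in datapackage.json to a temporary directory so it runs offline. It assumes the descriptor has the 'resources' shape shown in the describe() output earlier in this document, and that resource paths are relative to the descriptor's directory.

```python
import json
import os
import tempfile

# Stand-in descriptor, shaped like the describe() output shown earlier
# in this document; a real one would come from download_datapackage().
descriptor = {
    'name': 'jonloyens_an-intro-to-dataworld-dataset',
    'resources': [
        {'format': 'csv', 'name': 'changelog', 'path': 'data/ChangeLog.csv'},
        {'format': 'csv', 'name': 'datadotworldbballstats',
         'path': 'data/DataDotWorldBBallStats.csv'},
    ],
}

with tempfile.TemporaryDirectory() as dest_dir:
    descriptor_path = os.path.join(dest_dir, 'datapackage.json')
    with open(descriptor_path, 'w') as f:
        json.dump(descriptor, f)

    # Load the descriptor back and resolve each resource against its directory
    # (assumption: paths are relative to the descriptor's location).
    with open(descriptor_path) as f:
        loaded = json.load(f)
    base_dir = os.path.dirname(descriptor_path)
    resource_files = {
        r['name']: os.path.join(base_dir, r['path'])
        for r in loaded['resources']
    }
    resource_names = sorted(resource_files)
    all_under_base = all(
        path.startswith(base_dir) for path in resource_files.values())
```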
download_dataset(dataset_key)
Return a .zip containing all files within the dataset as uploaded.
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
- Returns
.zip file containing the files within the dataset
- Return type
file object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.download_dataset(
...     'username/test-dataset')
download_file(dataset_key, file)
Return a file within the dataset as uploaded.
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
file (str) – File path to be returned
- Returns
The file, as it was uploaded
- Return type
file object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.download_file('username/test-dataset',
...                          '/my/local/example.csv')
fetch_contributing_datasets(**kwargs)
Fetch datasets that the authenticated user has access to
- Parameters
limit (str, optional) – Maximum number of items to include in a page of results
next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)
sort (str, optional) – Property name to sort
- Returns
Authenticated user dataset
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_dataset = api_client.fetch_contributing_datasets()
>>> user_dataset
{'count': 0, 'records': [], 'next_page_token': None}
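The limit and next parameters, together with the next_page_token field in the response above, suggest a simple paging loop. The sketch below exercises such a loop against a fake page source so it runs offline. fetch_all and fake_fetch are hypothetical names, and the loop assumes every page has the 'records' and 'next_page_token' keys shown in the example output.

```python
def fetch_all(fetch_page, limit=2):
    """Collect records from every page.

    fetch_page is any callable accepting `limit` and `next` keyword
    arguments and returning a dict with 'records' and 'next_page_token'.
    """
    records, token = [], None
    while True:
        kwargs = {'limit': limit}
        if token is not None:
            kwargs['next'] = token       # continue from the previous page
        page = fetch_page(**kwargs)
        records.extend(page['records'])
        token = page.get('next_page_token')
        if not token:                    # no token means this was the last page
            return records


# Fake paged source used only to exercise the loop offline.
_ITEMS = ['a', 'b', 'c', 'd', 'e']


def fake_fetch(limit, next=None):
    start = int(next) if next else 0
    end = start + limit
    return {
        'count': len(_ITEMS[start:end]),
        'records': _ITEMS[start:end],
        'next_page_token': str(end) if end < len(_ITEMS) else None,
    }


all_records = fetch_all(fake_fetch, limit=2)
```

With a real client, a method such as api_client.fetch_contributing_datasets could presumably be passed as fetch_page, provided its responses keep the shape shown above.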
fetch_contributing_projects(**kwargs)
Fetch projects that the currently authenticated user has access to
- Returns
Authenticated user projects
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_projects = api_client.fetch_contributing_projects()
>>> user_projects
{'count': 0, 'records': [], 'next_page_token': None}
fetch_datasets(**kwargs)
Fetch authenticated user owned datasets
- Parameters
limit (str, optional) – Maximum number of items to include in a page of results
next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)
sort (str, optional) – Property name to sort
- Returns
Dataset definition, with all attributes
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_owned_dataset = api_client.fetch_datasets()
fetch_liked_datasets(**kwargs)
Fetch datasets that authenticated user likes
- Parameters
limit (str, optional) – Maximum number of items to include in a page of results
next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)
sort (str, optional) – Property name to sort
- Returns
Dataset definition, with all attributes
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_liked_dataset = api_client.fetch_liked_datasets()
fetch_liked_projects(**kwargs)
Fetch projects that the currently authenticated user likes
- Returns
Authenticated user projects
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_liked_projects = api_client.fetch_liked_projects()
fetch_projects(**kwargs)
Fetch projects that the currently authenticated user owns
- Returns
Authenticated user projects
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_projects = api_client.fetch_projects()
get_dataset(dataset_key)
Retrieve an existing dataset definition
This method retrieves metadata about an existing dataset
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
- Returns
Dataset definition, with all attributes
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> intro_dataset = api_client.get_dataset(
...     'jonloyens/an-intro-to-dataworld-dataset')
>>> intro_dataset['title']
'An Intro to data.world Dataset'
get_insight(project_key, insight_id, **kwargs)
Retrieve an insight
- Parameters
project_key (str) – Project identifier, in the form of projectOwner/projectid
insight_id (str) – Insight unique identifier.
- Returns
Insight definition, with all attributes
- Return type
object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> insight = api_client.get_insight(
...     'jonloyens/'
...     'an-example-project-that-shows-what-to-put-in-data-world',
...     'c2538b0c-c200-474c-9631-5ff4f13026eb')
>>> insight['title']
'Coast Guard Lives Saved by Fiscal Year'
get_insights_for_project(project_key, **kwargs)
Get insights for a project.
- Parameters
project_key (str) – Project identifier, in the form of projectOwner/projectid
- Returns
Insight results
- Return type
object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> insights = api_client.get_insights_for_project(
...     'jonloyens/'
...     'an-example-project-that-shows-what-to-put-in-data-world'
... )
get_project(project_key)
Retrieve an existing project
This method retrieves metadata about an existing project
- Parameters
project_key (str) – Project identifier, in the form of owner/id
- Returns
Project definition, with all attributes
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> intro_project = api_client.get_project(
...     'jonloyens/'
...     'an-example-project-that-shows-what-to-put-in-data-world'
... )
>>> intro_project['title']
'An Example Project that Shows What To Put in data.world'
get_user_data()
Retrieve data for authenticated user
- Returns
User data, with all attributes
- Return type
dict
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> user_data = api_client.get_user_data()
>>> user_data['display_name']
'Name User'
remove_linked_dataset(project_key, dataset_key)
Unlink dataset
This method unlinks a dataset from a project
- Parameters
project_key (str) – Project identifier, in the form of owner/id
dataset_key – Dataset identifier, in the form of owner/id
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.remove_linked_dataset(
...     'username/test-project',
...     'username/test-dataset')
replace_dataset(dataset_key, **kwargs)
Replace an existing dataset
This method will completely overwrite an existing dataset.
- Parameters
description (str, optional) – Dataset description
summary (str, optional) – Dataset summary markdown
tags (list, optional) – Dataset tags
license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license
visibility ({'OPEN', 'PRIVATE'}) – Dataset visibility
files (dict, optional) – File names and source URLs to add or update
dataset_key (str) – Dataset identifier, in the form of owner/id
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.replace_dataset(
...     'username/test-dataset',
...     visibility='PRIVATE', license='Public Domain',
...     description='A better description')
replace_insight(project_key, insight_id, **kwargs)
Replace an insight.
- Parameters
project_key (str) – Project identifier, in the form of projectOwner/projectid
insight_id (str) – Insight unique identifier.
title (str) – Insight title
description (str, optional) – Insight description.
image_url (str) – If image-based, the URL of the image
embed_url (str) – If embed-based, the embeddable URL
source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.
data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.
- Returns
message object
- Return type
object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.replace_insight(
...     'projectOwner/projectid',
...     '1230-9324-3424242442',
...     embed_url='url',
...     title='Test insight')
replace_project(project_key, **kwargs)
Replace an existing project
Create a project with a given id or completely rewrite the project, including any previously added files or linked datasets, if one already exists with the given id.
- Parameters
project_key (str) – Username and unique identifier of the creator of a project in the form of owner/id.
title (str) – Project title
objective (str, optional) – Short project objective.
summary (str, optional) – Long-form project summary.
tags (list, optional) – Project tags. Letters, numbers and spaces
license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license
visibility ({'OPEN', 'PRIVATE'}) – Project visibility
files (dict, optional) – Dict mapping each file name to a dict with the source URL, description and labels as properties. Description and labels are optional
linked_datasets (list of object, optional) – Initial set of linked datasets.
- Returns
project object
- Return type
object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.replace_project(
...     'username/test-project',
...     visibility='PRIVATE',
...     objective='A better objective',
...     title='Replace project')
sparql(dataset_key, query, desired_mimetype='application/sparql-results+json', **kwargs)
Executes SPARQL queries against a dataset via POST
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
query (str) – SPARQL query
- Returns
file object that can be used in file parsers and data handling modules.
- Return type
file object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.sparql('username/test-dataset',
...                   'query')
sql(dataset_key, query, desired_mimetype='application/json', **kwargs)
Executes SQL queries against a dataset via POST
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
query (str) – SQL query
include_table_schema (bool) – Flag indicating whether to include the table schema in the response
- Returns
file object that can be used in file parsers and data handling modules.
- Return type
file-like object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.sql('username/test-dataset', 'query')
sync_files(dataset_key)
Trigger synchronization process to update all dataset files linked to source URLs.
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.sync_files('username/test-dataset')
update_dataset(dataset_key, **kwargs)
Update an existing dataset
- Parameters
description (str, optional) – Dataset description
summary (str, optional) – Dataset summary markdown
tags (list, optional) – Dataset tags
license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license
visibility ({'OPEN', 'PRIVATE'}, optional) – Dataset visibility
files (dict, optional) – File names and source URLs to add or update
dataset_key (str) – Dataset identifier, in the form of owner/id
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.update_dataset(
...     'username/test-dataset',
...     tags=['demo', 'datadotworld'])
update_insight(project_key, insight_id, **kwargs)
Update an insight.
Note that only elements included in the request will be updated. All omitted elements will remain untouched.
- Parameters
project_key (str) – Project identifier, in the form of projectOwner/projectid
insight_id (str) – Insight unique identifier.
title (str) – Insight title
description (str, optional) – Insight description.
image_url (str) – If image-based, the URL of the image
embed_url (str) – If embed-based, the embeddable URL
source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.
data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.
- Returns
message object
- Return type
object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.update_insight(
...     'username/test-project', 'insightid',
...     title='demo datadotworld')
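The note that update_insight only touches the elements included in the request contrasts with the replace_* methods, which overwrite the existing definition. The difference can be modeled locally with plain dicts. This is a sketch of the semantics only: apply_update and apply_replace are hypothetical functions, the field values are invented, and actual server-side behavior may differ in detail.

```python
def apply_update(existing, **changes):
    """Update semantics: only the supplied fields change."""
    merged = dict(existing)
    merged.update(changes)
    return merged


def apply_replace(existing, **definition):
    """Replace semantics: the new definition overwrites the old one entirely."""
    return dict(definition)


# Invented insight definition, used only to compare the two behaviors.
insight = {'title': 'Test insight', 'description': 'first draft',
           'image_url': 'url'}

updated = apply_update(insight, title='Renamed insight')
replaced = apply_replace(insight, title='Renamed insight', embed_url='url')
```

After the update, the description and image URL survive; after the replace, only the fields supplied in the new definition remain.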
update_project(project_key, **kwargs)
Update an existing project
- Parameters
project_key (str) – Username and unique identifier of the creator of a project in the form of owner/id.
title (str) – Project title
objective (str, optional) – Short project objective.
summary (str, optional) – Long-form project summary.
tags (list, optional) – Project tags. Letters, numbers and spaces
license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license
visibility ({'OPEN', 'PRIVATE'}) – Project visibility
files (dict, optional) – Dict mapping each file name to a dict with the source URL, description and labels as properties. Description and labels are optional
linked_datasets (list of object, optional) – Initial set of linked datasets.
- Returns
message object
- Return type
object
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.update_project(
...     'username/test-project',
...     tags=['demo', 'datadotworld'])
upload_files(dataset_key, files, files_metadata={}, **kwargs)
Upload one or more dataset files
- Parameters
dataset_key (str) – Dataset identifier, in the form of owner/id
files (list of str) – The list of names/paths for files stored in the local filesystem
expand_archives (bool) – Flag indicating whether archive files should be expanded upon upload
files_metadata (dict, optional) – Dict mapping each file name to a dict of file metadata (description, labels and source URLs) to add or update
- Raises
RestApiException – If a server error occurs
Examples
>>> import datadotworld as dw
>>> api_client = dw.api_client()
>>> api_client.upload_files(
...     'username/test-dataset',
...     ['/my/local/example.csv'])