API docs


What are valid table names to use in SQL queries?

As files are added to datasets, data.world extracts and normalizes all the structured content in them, and derives a schema (tables and columns) that can be used for querying.

To list all the tables in a dataset, you can query the Tables metadata table. Specifically, you can run SELECT * FROM Tables, using the POST:/sql/{owner}/{id} endpoint.

To list all the columns, the process is similar, but uses TableColumns instead of Tables—i.e. SELECT * FROM TableColumns.
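
As a rough illustration of the two metadata queries above, the sketch below builds (but does not send) requests for the POST:/sql/{owner}/{id} endpoint. The base URL, the bearer-token header, the form-encoded `query` field, and the owner/dataset names are assumptions made for this example and are not confirmed by this page; check them against the API reference before relying on them.

```python
import urllib.parse
import urllib.request

# Assumption: base URL of the API (not stated on this page).
API_BASE = "https://api.data.world/v0"


def build_sql_request(owner, dataset_id, query, token):
    """Build, without sending, a POST:/sql/{owner}/{id} request."""
    url = "{}/sql/{}/{}".format(
        API_BASE, urllib.parse.quote(owner), urllib.parse.quote(dataset_id)
    )
    # Assumption: the SQL text travels as a form-encoded `query` field.
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: bearer-token authentication.
            "Authorization": "Bearer {}".format(token),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Hypothetical owner, dataset, and token, used only for illustration.
tables_request = build_sql_request(
    "some-owner", "some-dataset", "SELECT * FROM Tables", "YOUR_TOKEN"
)
columns_request = build_sql_request(
    "some-owner", "some-dataset", "SELECT * FROM TableColumns", "YOUR_TOKEN"
)
```

To actually run a query, pass one of the request objects to `urllib.request.urlopen` and read the response body.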

What is the difference between datasets and projects?

For simplicity, think of datasets as repositories of data. That’s normally where you’ll collect data files, in addition to other files that belong with them (documentation, scripts, etc).

Think of data projects as a tool to coordinate and document the use of data from datasets in fulfilling the project’s objective. Data projects start with a task, a question or a hypothesis and end with a conclusion supported by assets and insights collected along the way.

The primary source of data for a data project is its linked datasets. However, every data project also includes a default dataset that you can use to store file assets that only make sense in the context of the data project and don’t need a separate dataset of their own.

From an API perspective, all endpoints designed to work with datasets work with data projects too (they will act on the project’s default dataset). In addition, specific projects APIs allow you to perform operations that apply to projects only, such as creating insights and linking datasets.

For more details, see When to use a dataset and when to use a project.

Can I create an insight using an image from a dataset or project?

Yes, you can address images that live in datasets using the following URL pattern: https://data.world/api/{ownerId}/dataset/{datasetId}/file/raw/{urlEncodedFileName}

For example, imagine you wanted to use a file named languages.png in an insight. In this case, the parameters would be:

ownerId: jenka13all

datasetId: lara-hotel-reviews

urlEncodedFileName: languages.png
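
The URL pattern and parameters above can be assembled as in the following minimal sketch. The helper name `build_raw_file_url` is hypothetical, and encoding every reserved character in the file name (including `/`) is an assumption about what "urlEncodedFileName" requires.

```python
from urllib.parse import quote

# URL pattern quoted from the answer above.
RAW_FILE_PATTERN = "https://data.world/api/{owner_id}/dataset/{dataset_id}/file/raw/{file_name}"


def build_raw_file_url(owner_id, dataset_id, file_name):
    """Return the raw-file URL for a file stored in a dataset."""
    return RAW_FILE_PATTERN.format(
        owner_id=owner_id,
        dataset_id=dataset_id,
        # Assumption: percent-encode everything, including "/" (safe="").
        file_name=quote(file_name, safe=""),
    )


image_url = build_raw_file_url("jenka13all", "lara-hotel-reviews", "languages.png")
print(image_url)
```

A file name that needs no escaping, such as languages.png, passes through unchanged; a name containing spaces would have them replaced with `%20`.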

To create the insight, you would then call POST:/insights/jenka13all/lara-hotel-reviews with a payload that may look like:

    {
        "title": "The majority of the reviews were written in English.",
        "description": "That the top languages include some more untypical languages warrants a closer look at the data and a second round of classification...",
        "body": {
            "imageUrl": "https://data.world/api/jenka13all/dataset/lara-hotel-reviews/file/raw/languages.png"
        }
    }

Why don’t I see the data I have appended to my stream?

For performance reasons, data.world separates appending to a stream from processing the appended data. By default, data is processed daily, but that setting can be changed on a per-dataset basis. In addition, data can be processed on demand. For example, if your data is appended in batches, you can call POST:/datasets/{owner}/{id}/sync at the end of a given batch to force processing of that data.
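
The batch pattern described above might look like the sketch below: once the last record of a batch has been appended, a single call to the sync endpoint is made. The base URL, the bearer-token header, and the helper names are assumptions for illustration only; the `send` parameter exists so the call can be replaced with a stub during testing.

```python
import urllib.request

# Assumption: base URL of the API (not stated on this page).
API_BASE = "https://api.data.world/v0"


def build_sync_request(owner, dataset_id, token):
    """Build, without sending, a POST:/datasets/{owner}/{id}/sync request."""
    url = "{}/datasets/{}/{}/sync".format(API_BASE, owner, dataset_id)
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the endpoint is addressed purely by its path
        method="POST",
        # Assumption: bearer-token authentication.
        headers={"Authorization": "Bearer {}".format(token)},
    )


def finish_batch(owner, dataset_id, token, send=urllib.request.urlopen):
    """Request on-demand processing after a batch has been appended."""
    return send(build_sync_request(owner, dataset_id, token))
```

Calling `finish_batch` once per batch, rather than after every appended record, keeps the number of processing runs low.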

What happens if I make changes to my application?

We understand that changes happen. If you make changes that might interrupt the way your integration works, you’ll need to update any applicable responses in the initial submission form you used to get featured. We’ll contact you if we have any questions or if these changes need additional review.

Can my application or integration interact with private datasets?

Yes, it can! The only limits on private datasets are the general limit on the number of private projects/datasets available to a user in data.world (3) and any limits of your specific application/program.

Keep in mind that only specific users will be able to view and access private datasets.

Is it better to download raw data, or queries?

When downloading data through the API, queries can be tremendously helpful. Not only can you tailor the downloaded data to your specific needs, but you also avoid the slowness that comes with downloading large datasets in full.