Start a data project

In data.world, projects are where all querying, analysis, and discussion of data take place. Data in different datasets can be used for many different projects, but each project contains all of the data relevant to that project, and only that data. The information in a project can come from datasets, files attached directly to the project, insights written by the project's team members about the data and the project, and discussions about the project.

The idea behind a project is that all work and communication about that project done by any member of the project team is stored in the project, always accessible by all team members, and always the most up-to-date information.

To create a project, click on the + New icon in the top right of your menu bar and select New project.


In the Create a new project window, you are prompted to choose the owner of the project, the project name, and who can see the project. By default, if you are in an organization (also called a team), your organization is set as the owner of the project. If you belong to more than one organization, you can choose another from the dropdown, or you can switch to your personal account if you wish to be the owner as well as the creator of the project. The project can be visible to no one else, to the other members of your team (if the team is set as the owner), or to the whole data.world community.
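The choices in this window can be pictured as a small data model. The sketch below is illustrative only: the names NewProject, Visibility, and allowed_visibility are hypothetical and are not part of any data.world API. It assumes that the team option is offered only when an organization owns the project, as described above.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    # Hypothetical labels for the three choices described in the text.
    NO_ONE = "no one"
    TEAM = "team members"
    PUBLIC = "data.world community"


@dataclass(frozen=True)
class NewProject:
    owner: str              # organization id or personal account id
    owner_is_org: bool      # True when an organization (team) owns the project
    name: str
    visibility: Visibility

    def __post_init__(self):
        if not self.name.strip():
            raise ValueError("a project needs a name")
        # Sharing with the team only applies when the team is the owner.
        if self.visibility is Visibility.TEAM and not self.owner_is_org:
            raise ValueError("team visibility requires an organization owner")


def allowed_visibility(owner_is_org: bool) -> list[Visibility]:
    """Return the visibility choices available for a given kind of owner."""
    choices = [Visibility.NO_ONE, Visibility.PUBLIC]
    if owner_is_org:
        choices.insert(1, Visibility.TEAM)
    return choices
```

For example, allowed_visibility(False) returns only the NO_ONE and PUBLIC choices, which mirrors the window when you switch the owner to your personal account.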


Once your project has been created you are prompted to add an objective for it and given the opportunity to drag and drop files to the project or connect to a data source:


To link existing datasets to your project select Add data and you'll be taken to the Add data from anywhere dialog:


After you have selected Link a data.world dataset, you can search for the dataset you want to add using the search bar, or you can scroll through your datasets, your bookmarked datasets, and community results.


More information about adding data to your project can be found in the article Connect data to your project.

If you have data that might be used on other data projects, we recommend adding the raw data to a dataset and then linking that data to a project where you will do further analysis. This will allow you to link and access the dataset from multiple projects without having to import multiple copies of the data. For more information on making a new dataset, see Create a dataset.

To create a new untitled project and go straight to the project workspace, you can click on the Explore this dataset button from the dataset overview page.

If you'd like to create a new project based off the dataset and go through the initial project creation steps (e.g. giving the project a name and permission level and optionally adding a description and other data), then click on the dropdown arrow on the right of the Explore this dataset button and choose Create a new project.

You can also connect the dataset to an existing project from this overview page. To do that, click on the dropdown arrow on the right of the Explore this dataset button and choose Connect to existing projects to select from a list of available projects.


You can add data to a project in a few ways, either when creating it or at any later time.

Linking datasets to an existing project

From the project overview page, click the Connect Datasets link within the How do projects work? section, if available. Or, from below the project description and summary section, choose Add data and then data.world dataset.


You can also link datasets from the project workspace. From within the workspace, click the Add > Dataset button on the top left. Or, from the home tab, click the Drag and drop, upload files or connect to a data source box.

Linking existing datasets

You can link datasets to a project when creating the project or later if you require additional data. For projects that already exist, you can link datasets from the project overview page and the project workspace.

Linking datasets during project creation

When creating a new project, you can click on the large Add Data prompt to connect any sort of data, including a linked dataset.


You will then have the option to Link a data.world dataset in the new window that opens:


After selecting that option, you can use the search bar and the tabs below to find your datasets in your resources, among your bookmarks, or from the data.world community at large. Click on the question mark icon for some hints or see Using search for advanced search tips.


Click the Link button to the right of the dataset you would like to add. You can link as many datasets as you'd like. If you accidentally link one that you don't want, hover the mouse over the Linked button. It will change to an Unlink button, which you can click to remove the dataset from the project.
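A link is a reference to a dataset rather than a copy of it, which is why several projects can share one dataset. The following minimal sketch models that behaviour in plain Python. The Project class and the dataset keys are hypothetical and only illustrate the idea; this is not how data.world stores links.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    # Dataset keys such as "owner/dataset-id"; a link is a reference, not a copy.
    linked: set[str] = field(default_factory=set)

    def link(self, dataset_key: str) -> None:
        """Link a dataset; linking the same dataset twice has no extra effect."""
        self.linked.add(dataset_key)

    def unlink(self, dataset_key: str) -> None:
        """Remove a link; the dataset itself is left untouched."""
        self.linked.discard(dataset_key)


# One hypothetical dataset, stored once.
catalog = {"acme/sales-2023": ["orders.csv", "returns.csv"]}

forecast = Project("Forecast")
audit = Project("Audit")
forecast.link("acme/sales-2023")
audit.link("acme/sales-2023")   # a second project shares the same dataset
audit.link("acme/hr")           # linked by mistake
audit.unlink("acme/hr")         # the equivalent of clicking Unlink
```

After these calls both projects reference the same single entry in catalog, and unlinking removed only the reference held by the audit project.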

There are many great projects and datasets on data.world, and it's likely that at some point you are going to want to use data from them in your own work. There are two different ways to reuse data on data.world: linking, and downloading and re-uploading. Which option you choose depends on a few factors:

  • Is the source data in a project or a dataset?

  • How well does the source data meet your needs?

  • Is the data either streamed or regularly updated?

If the data is in a dataset (as opposed to in a project) and is well-documented, concise, and clean, you may very well want to link to it. However, if you need to make changes to it, you'll need to download it, edit it, and re-upload it.

Some reasons you might choose to link to the dataset include:

  • You don't need to make any additions to the source dataset (e.g., adding extra columns with data.world linked-data fact tables)

  • The source dataset is really clean so you don't need to go in and clean it up

  • The dataset is well-documented with a good dataset summary, references to the original source, and a complete data dictionary

  • The dataset is automatically updated from an external source

Some reasons you might choose to download and re-upload data include:

  • You want to add columns from data.world linked-data fact tables (e.g., US census region, currencies, ICD10 medical codes)

  • You only want to use a subset of the files in a dataset and don't want the rest of the files adding unnecessary complexity to your dataset or project

  • The data files would benefit from cleaning for clarity (e.g., removing blank columns, removing columns containing a single value, changing file or column names)

  • The data files only exist in a project and not in a dataset

  • The data dictionary and/or dataset summary are incomplete and you do not have write privileges to the dataset
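The reasons above can be condensed into a small decision helper. This is a sketch under stated assumptions, not an official rule: the Source fields and function names are hypothetical, and the helper simply recommends downloading and re-uploading as soon as any one of the listed reasons applies.

```python
from dataclasses import dataclass


@dataclass
class Source:
    in_dataset: bool            # False when the files exist only in a project
    needs_cleaning: bool        # e.g. blank columns, unclear file or column names
    needs_extra_columns: bool   # e.g. columns from linked-data fact tables
    want_subset_of_files: bool  # only some files in the dataset are wanted
    docs_incomplete: bool       # data dictionary or dataset summary has gaps
    can_edit_docs: bool         # you have write privileges on the source dataset


def reasons_to_reupload(src: Source) -> list[str]:
    """Collect every listed reason that applies to this source."""
    reasons = []
    if not src.in_dataset:
        reasons.append("files exist only in a project")
    if src.needs_extra_columns:
        reasons.append("needs extra columns")
    if src.want_subset_of_files:
        reasons.append("only a subset of files is wanted")
    if src.needs_cleaning:
        reasons.append("needs cleaning")
    if src.docs_incomplete and not src.can_edit_docs:
        reasons.append("documentation is incomplete and cannot be edited")
    return reasons


def recommend(src: Source) -> str:
    """Recommend linking unless at least one reason to re-upload applies."""
    return "download and re-upload" if reasons_to_reupload(src) else "link"
```

The helper deliberately ignores whether the source is automatically updated, which the list above gives as a reason to prefer linking, so weigh that factor separately.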

The table below summarizes the differences between linking to a dataset and downloading and re-uploading a data file:

Linked vs Reimported Data

  • Can add to a project

  • Can add to a dataset

  • Can extend data with data.world linked-data fact tables

  • Can edit data dictionary and dataset summary

  • Must recreate the dataset summary and the data dictionary for every file in the dataset

  • Do not have to use all of the files in a dataset

  • Can reuse data dictionary and other metadata

  • Automatically updated from original dataset

  • Must include all the files in a given dataset


Uploading files and data directly to your data project is great for materials specific to the project such as images, documentation, and code. You can also add data this way, but because you cannot link projects to other projects, data is generally best placed within datasets to make reuse easier. You can add files directly to your project by clicking the Add button next to your Project directory header and selecting Project file:


Files added to the project are not visible anywhere else on data.world. If there are files that you want to access from more than one project, upload them to a dataset instead of a project. For information on other ways to add data to your project, see the article How to get your data into data.world.