Base platform quickstart

At data.world we offer organizations and teams private, secure environments in which they can collaborate. With data.world, you and your team can put data and insights in the hands of those who need them while keeping valuable work from getting lost in inboxes or ad hoc conversations. As a result, your team's analysis and work will be more reproducible, reusable, and available for further collaboration.

data.world supports the largest open data community in the world, where we’re excited to connect members with a vast collection of scientific research, government, and demographic data, as well as other members who are interested in similar data work so they can join forces to solve real problems faster.

Whether you’re working in an open or closed setting, data on data.world can be brought together in one of two ways:

  1. Datasets help make your data accessible, reusable, and understandable. They contain the data and any additional metadata, scripts, or files that are related to that data. Datasets are the building blocks of data projects.

  2. Projects help organize your data, documentation, and output in one place when working on a specific question, project, or analysis to create reproducible work outside of the black box.

In this guide, we’ll walk through the basics of data.world, covering features and functionality to take you through your first data project from start to finish. Before getting started, ensure you’ve set up your account and joined any organizations you’ve been invited to. Next, we’ll walk through finding and adding data.

Connect your data

You can find thousands of free, open data resources on data.world to work with, but quite often you’re going to bring in your own data to use.

When adding data to data.world, you’ll typically want to create a dataset. A dataset is simply a repository of data including data files and associated metadata, documentation, scripts, and any other supporting assets that should be stored alongside the data. Datasets can be created manually, which is what we’ll walk through here, and we also have automated options for working with larger datastores (contact us for more details).

To create a dataset, click on the + New dropdown on the right side of the header bar and select New dataset.

After you've opened the new dataset, you will:

  • Add a title (up to 60 characters)

  • Select an owner (if prompted) - if your dataset is for an organization, we recommend creating it under the team account to keep your organization's work within a single library.

  • Choose the visibility of your dataset:

    • Private - accessible only to you

    • Organization - allows your team to view the dataset

    • Open - available to the data.world community

If you’re unsure which option to choose, we recommend starting with Private, then adding contributors and increasing visibility as you go.
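
For teams that script their dataset setup, the choices above can be sketched in Python. This is a minimal illustration, not data.world's API: the DatasetSettings class, the widen_visibility helper, and the lowercase visibility names are all hypothetical, and the non-empty title check is an assumption. Only the 60-character title limit, the three visibility options, and the advice to start private come from the steps above.

```python
from dataclasses import dataclass, replace

MAX_TITLE_LENGTH = 60  # title limit stated in the steps above
# Ordered from least to most visible (names are hypothetical labels).
VISIBILITY_LEVELS = ("private", "organization", "open")


@dataclass(frozen=True)
class DatasetSettings:
    """Hypothetical record of the choices made when creating a dataset."""

    title: str
    owner: str
    visibility: str = "private"  # recommended starting point

    def __post_init__(self):
        # Assumption: a blank title is rejected.
        if not self.title.strip():
            raise ValueError("title must not be empty")
        if len(self.title) > MAX_TITLE_LENGTH:
            raise ValueError(
                f"title is {len(self.title)} characters; "
                f"the limit is {MAX_TITLE_LENGTH}"
            )
        if self.visibility not in VISIBILITY_LEVELS:
            raise ValueError(f"unknown visibility: {self.visibility!r}")


def widen_visibility(settings):
    """Return a copy one step more visible; 'open' stays 'open'."""
    index = VISIBILITY_LEVELS.index(settings.visibility)
    next_index = min(index + 1, len(VISIBILITY_LEVELS) - 1)
    return replace(settings, visibility=VISIBILITY_LEVELS[next_index])


draft = DatasetSettings(title="Q3 customer churn", owner="my-team")
shared = widen_visibility(draft)
print(draft.visibility, "->", shared.visibility)  # private -> organization
```

Keeping the settings immutable and widening visibility one step at a time mirrors the recommendation to start private and open up access gradually.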

Next, either drag and drop one or more files into the add data box, select the Add data button for additional source options, or save your dataset and add data at a later time.

There are multiple ways to add data and connect your data sources:

  • Upload from your computer, or select cloud storage services (Box, Dropbox, Google Drive)

  • Pull directly from URL or API

  • Connect your data via an integration (check out our super connectors if you don't see your datastore listed)

To upload files from your computer, either drag and drop the file(s) from your hard drive to the Add data window or select the Add data button for more options.

data.world supports all file types so that data owners can fully document their data. Did you use a script to clean the raw data? Upload it so others can see and build on your work. See the article on file types for more information about how data.world handles different file formats. When you're ready, select the Create dataset button at the bottom of the page to continue.