
Working with data

Once your data is on the platform, or you've used search to find open or team-specific datasets you have access to, you'll probably want to do something with it. This is where projects come in.

While you can do many of the same things in a dataset, the real place to work with data on data.world is in a project. Projects let you connect to data from multiple datasets so you can combine and analyze it, document your analysis, post insights, share with others, and collaborate through discussions.

To kick off your project, select the + Add link at the top right of your screen and choose New project.

From there you will be taken to a basic Create a new project page where you'll configure:

  • Owner (if prompted): if your project is for an organization, we recommend creating it under the team account to keep your organization's work within a single library.

  • Project name

  • Project objective: projects are best when they start with a clear question or goal.

  • Project permissions:

    • Private - only accessible to you.

    • Organization - allows your team to view the project.

    • Open - available to the data.world community.

    If you’re unsure which permissions to choose, we recommend starting with Private, then adding contributors and publishing more widely when you're ready.
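As a rough mental model, the fields on this page can be sketched as a small settings object. This is only an illustration of the options described above: `ProjectSettings`, `Visibility`, and the example values are hypothetical names, not part of any data.world API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Visibility(Enum):
    PRIVATE = "private"            # only accessible to you
    ORGANIZATION = "organization"  # your team can view the project
    OPEN = "open"                  # available to the data.world community


@dataclass
class ProjectSettings:
    """Hypothetical container mirroring the fields on the form."""
    name: str
    objective: str                 # a clear question or goal
    owner: Optional[str] = None    # team account, if you are prompted for one
    visibility: Visibility = Visibility.PRIVATE  # start private, widen later


# Example values are made up for illustration.
settings = ProjectSettings(
    name="Opioid funding analysis",
    objective="How has federal spending changed year over year?",
    owner="my-team",
)
```

Defaulting `visibility` to `PRIVATE` reflects the recommendation above: start private and open the project up when you're ready.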


Once your project has been created, add or link data to your project via:

  1. Finding and linking in existing data.world datasets

  2. Creating your own datasets and linking them into the project

  3. Adding data directly to the project (limit to data that is unique to that project and wouldn't make sense to reuse elsewhere)

To connect or add data, click the Add data button on your project page or workspace, then choose data.world dataset to browse for existing datasets or New file to add a local project file.

When you pull an existing dataset into your project, you're really linking to its original location. This means that all updates to the original data appear in your project automatically. It also means that you can't modify the underlying data or metadata from within the project interface; however, you can still use the data in queries to produce the output you need.
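The link-not-copy behavior can be illustrated with a small, self-contained sketch. It is a conceptual model only: the `Dataset` and `Project` classes below are hypothetical stand-ins, not how data.world implements linking.

```python
from types import MappingProxyType


class Dataset:
    """Hypothetical stand-in for a dataset that lives outside the project."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)


class Project:
    """Hypothetical project that links to datasets instead of copying them."""

    def __init__(self):
        self._linked = {}

    def link(self, dataset):
        # Store a reference, not a copy, so later updates stay visible.
        self._linked[dataset.name] = dataset

    def view(self, name):
        # Read-only views of the rows: the project cannot edit the source.
        return tuple(MappingProxyType(row) for row in self._linked[name].rows)

    def query(self, name, predicate):
        # Derive a new result without touching the underlying data.
        return [dict(row) for row in self.view(name) if predicate(row)]


sales = Dataset("sales", [{"region": "east", "total": 10}])
project = Project()
project.link(sales)

# An update made at the dataset's original location...
sales.rows.append({"region": "west", "total": 25})

# ...is visible through the project without re-adding anything.
big = project.query("sales", lambda row: row["total"] > 20)
```

Here the project sees the appended row because it holds a reference to the dataset, while any attempt to assign into a row returned by `view` raises a `TypeError`. The query result is a separate list, which mirrors using queries to shape output without changing the source.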

When adding new files, you can add data from your computer or cloud storage, URLs, and integrations, just as we covered in adding files to datasets.

Projects have metadata just as datasets do, so document them the same way. In a project, use the summary to document the project and its findings, keep track of your questions and to-dos, and record your sources. Two examples show how a summary can be used: "Exploring THOR" and "How is the federal government fighting the opioid epidemic?"