About projects

Projects bring datasets together with documentation and analysis. This is where work and collaboration happen. A project, as the name implies, likely has a beginning and an end. Data in it is shared and analyzed, and insights are derived from the analysis and written up in the project.

Projects are where all querying, analysis, and discussion of data take place. The data in a dataset can be used in many different projects, but each project contains all of the data relevant to it and nothing else. The information in a project can come from datasets, files attached directly to the project, insights written by the project's team members about the data and the project, and discussions about the project.

The biggest difference between a dataset and a project is that datasets can be linked to and included in projects, but projects cannot be linked to or included in other projects or datasets--nor can the files that are added directly to a project. With a project you can run queries against the data, analyze it, share it and create charts and visualizations from it.

Although you can add files directly to a project, the recommended practice is to create datasets, add your files to them, and then link the datasets to the project.
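
The linking rules described above can be pictured with a small Python model. This is an illustrative sketch only, not data.world's implementation or API: the Dataset and Project classes, the link and available_files methods, and the sample names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """Hypothetical model: a reusable collection of files that many projects may link."""
    name: str
    files: List[str] = field(default_factory=list)


@dataclass
class Project:
    """Hypothetical model: a workspace with its own files plus linked datasets."""
    name: str
    files: List[str] = field(default_factory=list)       # added directly; usable only here
    linked: List[Dataset] = field(default_factory=list)  # references to datasets, not copies

    def link(self, resource: object) -> None:
        # Only datasets can be linked; a project cannot be included in another project.
        if not isinstance(resource, Dataset):
            raise TypeError("only datasets can be linked to a project")
        self.linked.append(resource)

    def available_files(self) -> List[str]:
        # The project's own files followed by the files of every linked dataset.
        return self.files + [f for ds in self.linked for f in ds.files]


sales = Dataset("sales", ["q1.csv"])
forecast = Project("forecast", files=["notes.md"])
pricing = Project("pricing")
forecast.link(sales)
pricing.link(sales)

# A file added to the dataset shows up in every project that links it.
sales.files.append("q2.csv")
print(forecast.available_files())  # ['notes.md', 'q1.csv', 'q2.csv']
print(pricing.available_files())   # ['q1.csv', 'q2.csv']

# Trying to link one project into another is rejected.
try:
    pricing.link(forecast)
except TypeError as err:
    print(err)
```

In this sketch, notes.md stays confined to the forecast project, while the files in the sales dataset are reusable from both projects. That reuse is the reason for keeping files in datasets.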

Start with a question or task

Through hundreds of interviews with people who work with data, we have found that most work stems from a question rather than a particular set of data. It is the question that drives the search for relevant data, generating insights, and presenting reproducible findings, yet there hasn't been a great way to keep all your work in one place--not to mention collaborate easily with others on the project.

Projects help you to capture and share the most important aspects of your work as the project unfolds, even across multiple datasets, from question to conclusion.

With data projects you can:

  • Keep all project data in a single place by linking datasets or adding data directly in the project

  • Add people as contributors or viewers to your project and see the project activity stream

  • Use the workspace to document, explore, query and chart any of the local or linked data and files.

  • Share and discuss insights

Sample projects on data.world community

Check out some of the following sample projects:

Check out some of our favorite projects below, or jump straight into creating your own!

Once you've explored projects, we'd love to hear from you! Please contact us with any questions or feedback, or join our Slack community to connect with the data.world team and its members.

Page layout

When you land on a project workspace, you can tell what is showing in the main area of the screen by what is highlighted in the Project directory in the left sidebar.

The project workspace has five main parts:

  • The left sidebar is the Project directory

  • Underneath the header in the center are the tabs and actions for the current object

  • Beneath the tabs in the center is the object viewer

  • In the right sidebar (when visible) on the top is the About section with information about the currently selected object

  • For some objects, the bottom of the right sidebar contains the Project schema

Project directory

The Project directory is the navigation area of the workspace. At the top is the + Add dropdown, where you can add files to the project, link datasets to it, add posts or insights, and add new SQL or SPARQL queries.

Below the + Add button are a link to the Home tab and the main project files (Project summary and Data dictionary). For information about the project summary and the data dictionary, see the articles Description and Summary, and Data dictionary.

The next section of the Project directory is for project files. Project files are any data resources uploaded or saved directly to the Project--not to a dataset.

Below the project files are the connected datasets. The datasets used in this project may also be linked to and used in other projects. When the underlying dataset changes, every project that uses it updates as well. All files and queries associated with a dataset are linked to the project and can be used in it.

More information on how to create a dataset is in the article on creating a dataset.

The last two sections of the Project directory, queries and insights, are also specific to the project. The Queries section contains all the stored queries in the project. Learn how to write and use queries in the article Querying data. We also have a SQL tutorial and a SPARQL tutorial to help you learn or improve your query language skills.

Insights are findings, conclusions, or interesting points for discussion about your project. They allow you to capture conclusions from your work, packaging them up in a way that quickly communicates a nugget of information, while giving the viewer the tools they need to dig down into your methods and sources. See Posting insights for details on insights.

Tabs and actions

Whenever you select an object or the + Add button in the Project directory, a new tab opens under the header bar with a preview of that object. Underneath the tabs are an icon indicating the type of object shown, the name of the object, and, to the right of the name, any actions that can be taken with that object and whether it is shared with all users of the project or private to you. Further to the right is a link to create a new query template, or parameterized query. You can learn more about parameterized queries in the article Using query templates. The last icon, an arrow, expands or collapses the About and Project schema panel. For more information about query-specific actions, see the article Query data.

Object viewer

The central area of the workspace is devoted to viewing or interacting with the object selected in the Project directory. The viewer renders previews of most file types (for a complete list of the file types supported by preview, see the article Supported file types). When the object is a query, the viewer is split into two parts: the query editor at the top and the query results at the bottom. The query editor is where you compose and run queries against your data. As you type, data.world's query editor also provides auto-complete and auto-format for commands, columns, and tables.

The bottom center of the screen displays the results of a query, or an error message when the query is not written correctly.
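
The split between results and error messages can be tried out locally with a short Python sketch. It uses Python's built-in sqlite3 module as a stand-in, so it does not reflect data.world's query engine or its SQL dialect. The orders table, its rows, and the run_query helper are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for a project's queryable data (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("east", 30.0)],
)


def run_query(sql):
    """Return ('results', rows) for a valid query, or ('error', message) otherwise."""
    try:
        return "results", conn.execute(sql).fetchall()
    except sqlite3.Error as err:
        return "error", str(err)


# A well-formed query yields rows, as the results pane would show.
print(run_query(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
))

# A misspelled keyword yields an error message instead of rows.
print(run_query("SELEC region FROM orders"))
```

The first call returns the summed amount per region. The second returns the database's error text, which is the kind of feedback that appears in place of results when a query is malformed.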

About

In the upper right corner of the project workspace is a pane with information about the object highlighted in the Project directory. For queries and files, it shows additional details that may be relevant to using them.

Project schema

The project schema appears at the bottom of the right sidebar whenever the object displayed in the main section is a query. It lists the queryable entities in the project (datasets, tables, and columns). Items can be expanded or collapsed, and selecting a column or table copies its name to the clipboard so that you can paste it into the query editor.