Community docs

Create a query

To write a SQL query against data in a project go to the upper left corner of the workspace and select SQL Query from the + Add menu:

Image_2019-09-18_at_10.50.33_AM.png

When the query editor comes up in the center panel of the screen, the Project Schema also comes up in the right side panel. If the project schema isn't showing, select the arrow to the right of the Run query button to expand the sidebar:

Image_2019-09-18_at_11.19.50_AM.png
Auto-complete of terms and objects

As you type your query, the editor suggests SQL terms--operators, aggregations, and functions--for you. Hitting 'enter' will autocomplete the highlighted suggestion in the list. Arrow down or up to choose another option:

Image_2019-09-18_at_11.48.00_AM.png
Click-to-copy for columns and tables

You have the option of typing in the table and column names for the data you are querying, but you can also use the click-to-copy feature on the project schema to copy the names for you so you don't have to worry about typos.

Image_2019-09-18_at_2.31.00_PM.png
Pop-up help text

Help text is provided for all SQL functions and aggregations when you hover over the function name:

Image_2019-09-18_at_11.52.24_AM.png
How to auto-format queries

While you can type on one line type your SQL query all on one line--it is perfectly legal and will run--it's good practice to split it up based on clauses and punctuation. The line breaks make it easier to see where you may have forgotten a comma or some other element of the query, and it also allows us to give you better error messages if there are problems with your query.

Here's an example of a query before and after running it through our auto-formatter:

Image_2019-09-18_at_2.04.12_PM.png
Image_2019-09-18_at_2.04.55_PM.png

To auto-format your query, use the keyboard shortcut Cmd + Option + L . If you forget the command there is a dropdown menu to the right of the Run query button which has a link to a list of all the keyboard shortcuts.

Image_2019-09-18_at_1.14.15_PM.png

From the dropdown menu you can also manage your query (rename, copy, etc.) as well as get to our SQL tutorial for additional help.

Image_2019-09-18_at_1.13.59_PM.png
Troubleshooting

If you try to auto-format your query and it doesn't work, that's a sign that you have an error in the query. If you run it you'll get an error message telling you what the problem is and where to find it:

Image_2019-09-18_at_2.18.18_PM.png

In the above example the query is written to search for all the records that have the value "won" in the deal_stage column in the sales_pipeline table. However the way the query is written it looks like it is searching for records where the value in the deal_stage column matches the value in another column named Won. For "won" to be recognized as a value, it has to be surrounded by double-quotes. The error message returned from running the query tells you that the problem is in the 4th line and the 36th character (column). The 36th character should have been the opening " on the 4th line:

Image_2019-09-18_at_2.46.44_PM.png
Saving queries

When your query is complete, save it for later use by selecting Save to the left of the Run query button. You'll be prompted to name the query, add a description, determine where to save it, and indicate who can see it.

It's good practice to give the query a short descriptive name for both your own later use and for use by others working on your project. Adding a description also makes your query more useable as the description shows up on the resource card returned from using search on data.world.

Saving a query to a dataset

Queries are saved to projects by default. The logic behind this structure is that datasets are for storing files and tables, and projects are for querying and analyzing those files and tables. A dataset is meant to be reused in multiple projects, and if queries are saved to it instead of to the projects using it then the dataset can rapidly fill with irrelevant queries making it difficult to use. However if the queries specific to a project are all stored in that project, the linked dataset remains clean and ready for reuse.

The reasoning above covers 80% of the use cases, but what about the times you really do want to save a query to a dataset? Maybe you want to clean up the data, join tables and preserve the lineage of the original tables for reference, or just use the query in multiple projects without having to rewrite it (you might even want to parameterize it). In those cases it is useful to be able to save your query to the dataset, and you can do that. After running your query, to save it to the dataset select the Save link and click the drop-down link to the right of the + New Project option. In addition to New project you'll also see the name of the dataset. Select it and the query will be saved to the dataset and you'll still be in an untitled, unsaved project:

Image_2019-10-16_at_5.10.12_PM.png

One thing about saving queries to the dataset instead of to the project is that queries saved to a dataset won't show up in the queries list of any project the dataset is used in. Instead they'll be displayed under the connected datasets info:

Image_2019-10-16_at_5.17.38_PM.png