Skip to main content

Advanced work with queries

In the tutorial Query your data we introduced the query languages used on data.world and the way queries are written and formatted in our query editor. In this tutorial we go to the next level as we discuss the different ways queries can be used. If you need help learning a query language in order to write queries, see our SQL documentation and tutorial, or our SPARQL documentation and tutorial.

Objectives

After working through the tutorial you should be able to:

  • Use other people's queries

  • Save a query to a dataset

  • Make a template from a query

  • Download query results

  • Save query results to a dataset or project

  • Find information on how to work with third-party applications

Requirements

To complete this tutorial you need to have:

  • A data.world login (available for free here if you don't have one).

  • Your own tutorial project (you must create this yourself--it cannot be downloaded).

  • The Bee Colony Statistics dataset linked to your project

If you need help creating the project or linking the dataset to it, detailed instructions are in the tutorial Create a project to work with data.

Introduction

Note

If you prefer to go straight to the exercises, click here

Use other people's queries

As you work in projects with other people, you might find a query that you want to use and edit, but it's not your query so you don't have permissions to save changes to it. The easiest thing to do is to make a copy of the query by selecting the three dots to the right of the query name in the project workspace and selecting Save a copy:

Screen_Shot_2020-02-03_at_5.21.51_PM.png

You'll be prompted to name the query and write a description. For tracking lineage it's a good idea to include the name of the original query in the description. Other save options include where--the current project or the original dataset--and who can see the new query (hidden by the location selection in the image below:

Screen_Shot_2020-02-03_at_5.24.57_PM.png

Note that if the original query is a project query (listed in the queries at the bottom of the Project directory in the left sidebar), you can only save the query to the original project--if you want to use it in a different project, you will have to copy the query text and paste it into a new query window in the other project. Here, too, it's a good idea to copy the original source of the query in case you want to go back to it later.

Save a query to a dataset

Most of the time you'll want your queries to be associated with a project. However if the query would be useful to others who create projects from a dataset, it might make more sense to save it directly to the dataset. Saving it to the dataset and sharing it with everyone enables anyone with access to the dataset to use it in their own projects. It's also the easiest way for others to save the query to their own projects using the dataset. One note: queries saved to datasets show up in the Connected datasets section of the Project directory in the left sidebar--not under Queries:

To save a query to a dataset, click the menu icon to the right of the field Where will the query be saved? and select the dataset where you want to save it:

Screen_Shot_2020-02-02_at_2.25.47_PM.png

Make a template from a query

Another way to use a query is to make a template from it so you can change the input strings to various parts of it. For example, if we had a query that we could turn the query we just saved as Bee colony census by state and year into a query template by selecting the New template link at the top right of the query panel:

Screen_Shot_2020-02-03_at_5.27.58_PM.png

A dialog window will pop up giving you basic instructions to how query templates work and linking you to the documentation for query templates:

query_template.png

Click the Add a sample statement button which will add a sample of the DECLARE statement used in query templates, a sample comment, and a link to the documentation to the top of your query window:

Screen_Shot_2020-02-03_at_5.29.13_PM.png

With just a couple of modifications to your SQL you can have a query that other users can use to input values for a field without ever having to know SQL themselves. For example, to change the query above into a query template that returns the results for any specified census year, all you need to do is:

  • Insert a new line above the DECLARE statement and enter a comment holding instructions for what to enter in the variable field

  • Change the DECLARE clause so that the variable ?year is defined as an integer

  • Modify the WHERE clause to match where year = the year variable

The contents of the query editor will look like this:

Screen_Shot_2020-02-25_at_8.34.47_PM.png

You can preview the query to see what other users will see and to test the query yourself by selecting the Preview query template link in the upper right corner of the query panel. Then fill in a value and hit Run query:

Screen_Shot_2020-02-03_at_5.41.22_PM.png

Download query results

Another way to use a query is to download the results from it in either a csv or an xlsx file. This option is useful if you want to work with your data file in a spreadsheet program. After running your query, click the Download link below the query window and select the option you want:

Screen_Shot_2020-02-03_at_5.42.10_PM.png

Save results to a dataset or project

Another way to work with your query results is to save them as a new table in a dataset or project. For those familiar with creating a View in a SQL database, this is quite similar. When the results are saved into a table they can be queried just like any other file--eliminating the need for extremely complex queries. You can save the table in your current project, in an existing dataset, or in a new dataset. Select Save to dataset or project from the dropdown menu on Download, name your file, and either choose a project or dataset from the dropdown menu, or begin typing the name of another one:

save_query_to_file.png

There are other reasons you might want to save the results of your query in a new tabular file. See the article on working with query results for more information.

Two other ways you can use the results of a query are to make them available for download from a URL, or to embed them on a Markdown page. Both options are found in the Download menu under Copy URL or embed code. The URL is a download link for a csv file of the results, and the embed code option renders the results table in a Markdown page.

Connect to third-party apps

There are many applications you can use to analyze and display the results of your queries, and we make it easy to work with your favorites. You can find out about the applications we integrate with on our integrations page. If you already have an integration configured that you want to use or want to integrate a new application, the name of the first application in your list of integrations will show up as a menu to the right of the Download button:

Screen_Shot_2020-02-03_at_6.14.02_PM.png

Exercises

Save a copy of a query

  1. Open the workspace of your tutorial exercises project.

  2. Expand the view of the contents for the Bee Colony Statistics dataset and make a copy of the query Bee colony census by year for Montana (click on the three dots to the right of the query name in the left sidebar and select Save a copy).

  3. Save the query to your project with the name Bee colony census by year for individual states.

Make a template from a query

  1. With the query Bee colony census by year for individual states open, select the New template link from the top right of the query editor

  2. select Add a sample statement.

  3. Use cut and paste to move the first line of the modified query (the DECLARE statement) down below the two comment lines.

  4. Change the text in the first comment line to State names must be entered in ALL CAPS (leave the # at the beginning of the line to denote that it's a comment, not a command), and delete the entire second comment line (# and all).

  5. Replace ?column in the DECLARE statement with ?state.

  6. Replace "Montana" in the WHERE clause of the query with ?state.

  7. Select the Preview query template link in the upper right of the query editor. You should see a screen like this:

    Screen_Shot_2020-02-04_at_11.40.58_AM.png
  8. Test your query by entering a state name in the blank field and running the query (note the comment above the selection field).

  9. Select edit query if you need to make changes.

  10. If your query returns the correct results, save it.

  11. Click here to see what the query should look like.

Save query results to a project

  1. Using the Preview query screen on the query Bee colony census by year for individual states (open at the end of the last exercise), enter a state name and run the query.

  2. Select Save to dataset or project from the Download link at the bottom of the query editor.

  3. Change the file name to Bee colony census by year for <your state name> (e.g., Bee colony census by year for Hawaii), and save the file to your project.

Conclusion

The variety of ways to access data and the ease of querying it are some of the most powerful features of data.world. We introduced the basics of querying in the tutorial Query your data, and in this tutorial we dove into the ways you can use queries to get the most out of your data. The best way to become comfortable with all the options available when querying data is to practice. If you haven't looked through our SQL or SPARQL documentation yet, now would be a good time to do it to get a feel for just how much you can do. If you're new to SQL, try our SQL tutorial to get the most out of querying with it on data.world.

References