SQL has several built-in functions for manipulating date data in fields. Some of the more common ones are
For a full list of date functions see the reference section.
DATE_DIFF returns the difference between two date or datetime values as
a given unit of duration. It takes three arguments: the start date, the end
date, and the unit of duration (one of “year”, “decade”, “century”, “quarter”,
“month”, “week”, “day”, “hour”, “minute”, “second”, or “millisecond”).
If we wanted to know how many deals took more than 135 days to close in our deal pipeline we could run the query:
SELECT account, sales_agent, DATE_DIFF(created_date, close_date, "day") AS `days to close` FROM sales_pipeline WHERE DATE_DIFF(created_date, close_date, "day") > 135
And the results would look like:
|account||sales_agent||days to close|
|Vehement Capital Partners||Cassey Cress||138|
We could also combine
AVG to find out the average number of
days deals in the pipeline took to close:
SELECT AVG(DATE_DIFF(created_date, close_date, "day")) FROM sales_pipeline
DATE_ADD takes a date and adds a specified amount of time to it. As with
DATE_DIFF, the time added can be any of the following: “year”, “decade”,
“century”, “quarter”, “month”, “week”, “day”, “hour”, “minute”, “second”, or
The number we got for average days to close from the last query can be combined with the DATE_ADD function to see which deals took longer than average to close:
SELECT account, sales_agent, DATE_DIFF(created_date, close_date, "day") AS `days to close`, created_date, close_date FROM sales_pipeline WHERE DATE_ADD(sales_pipeline.created_date, 50, "day") < close_date
|account||sales_agent||days to close||created_date||close_date|
|The New York Inquirer||Boris Faz||89||2017-02-22||2017-05-22|
|Vehement Capital Partners||Corliss Cosme||106||2017-02-22||2017-06-08|
NOW can also be combined with other date functions like DATE_ADD and
DATE_DIFF. For example if we wanted to know the age in days of all the deals we
have in progress we could run the following query:
SELECT account, sales_agent, DATE_DIFF(created_date, NOW(), "day") AS `days open`, created_date FROM sales_pipeline WHERE sales_pipeline.deal_stage = "In Progress" AND account IS NOT NULL
|Bluth Company||Maureen Marcano||318||2017-02-26|
DATE_PART is a function that returns a numeric value for part of a date or
datetime value. Like
DATE_PART takes different
units as one of its arguments. The unit types recognized by
“timezone”, “timezonehour”, “timezoneminute”, “year”, “decade”, “century”,
“quarter”, “month”, “week”, “day”, “dayofweek”, “dayofyear”, “hour”, “minute”,
“second”, and “millisecond”.
Though there are fields for create date and close date in our dataset, there isn’t one for the quarter in which deals were opened or closed. If we wanted to know in which quarters the biggest deals were closed, we could run a query like this:
SELECT account, sales_agent, DATE_PART("quarter", close_date) AS qtr, YEAR(sales_pipeline.close_date), close_value FROM sales_pipeline WHERE sales_pipeline.deal_stage = "Won" ORDER BY close_value DESC
YEAR which was used in the last query is one
of the other datetime functions explicated in the reference section.
The results of which look like:
Exercises (Continued from previous section)
There are two ways to do these exercises. The first is to use the “Try query” links to test your queries without saving them. The second is to create a data.world project and save your queries to it. If you are reading this documentation and completing the exercises as a tutorial, you will need to create your own project to save your work. Details and instructions are in the SQL tutorial which has instructions for setting up your project and links to all the current exercises.
Write a query against the patients table that returns the patient, birthdate, and deathdate, and the patients age in years as age at death. The results should only include patients who have died, and they should be ordered by the age at death, oldest to youngest.
SELECT the columns
deathdate, and find the age of the person at death using
DATE_DIFF on the columns
deathdate, with the value rendered in
“years”). Label the results column
`age at death`. Select the columns
patients table and only return records
IS NOT NULL.
ORDER your results
`age at death`.
An introduction to working with arrays.