Working With Dates
An introduction to working with dates.
SQL has several built-in functions for manipulating date data in fields. Some of the more common ones are DATE_DIFF
, DATE_ADD
, NOW
, and DATE_PART
For a full list of date functions see the reference section.
DATE_DIFF
DATE_DIFF
returns the difference between two date or datetime values as
a given unit of duration. It takes three arguments: the start date, the end
date, and the unit of duration (one of “year”, “decade”, “century”, “quarter”,
“month”, “week”, “day”, “hour”, “minute”, “second”, or “millisecond”).
If we wanted to know how many deals took more than 135 days to close in our deal pipeline we could run the query:
SELECT account,
sales_agent,
DATE_DIFF(created_date, close_date, "day") AS `days to close`
FROM sales_pipeline
WHERE DATE_DIFF(created_date, close_date, "day") > 135
And the results would look like:
account | sales_agent | days to close |
---|---|---|
Isdom | Darcel Schlecht | 137 |
Warephase | Darcel Schlecht | 138 |
Funholding | Niesha Huffines | 136 |
Gogozoom | Anna Snelling | 136 |
Hottechi | Moses Frase | 137 |
Zotware | Garret Kinder | 137 |
Codehow | Gladys Colclough | 136 |
Vehement Capital Partners | Cassey Cress | 138 |
We could also combine DATE_DIFF
with AVG
to find out the average number of
days deals in the pipeline took to close:
SELECT AVG(DATE_DIFF(created_date, close_date, "day"))
FROM sales_pipeline
avg |
---|
49.99 |
DATE_ADD
DATE_ADD
takes a date and adds a specified amount of time to it. As with
DATE_DIFF
, the time added can be any of the following: “year”, “decade”,
“century”, “quarter”, “month”, “week”, “day”, “hour”, “minute”, “second”, or
“millisecond”.
The number we got for average days to close from the last query can be combined with the DATE_ADD function to see which deals took longer than average to close:
SELECT account,
sales_agent,
DATE_DIFF(created_date, close_date, "day") AS `days to close`,
created_date,
close_date
FROM sales_pipeline
WHERE DATE_ADD(sales_pipeline.created_date, 50, "day") < close_date
account | sales_agent | days to close | created_date | close_date |
---|---|---|---|---|
Cancity | Moses Frase | 132 | 2016-10-20 | 2017-03-01 |
Isdom | Darcel Schlecht | 137 | 2016-10-25 | 2017-03-11 |
Bioholding | Anna Snelling | 126 | 2016-11-04 | 2017-03-10 |
Donquadtech | Violet Mclelland | 101 | 2016-11-25 | 2017-03-06 |
The New York Inquirer | Boris Faz | 89 | 2017-02-22 | 2017-05-22 |
Vehement Capital Partners | Corliss Cosme | 106 | 2017-02-22 | 2017-06-08 |
Zencorporation | Corliss Cosme | 126 | 2017-02-22 | 2017-06-28 |
Sumace | Daniell Hammack | 84 | 2017-02-22 | 2017-05-17 |
NOW
The NOW
function returns the current date and time providing a timestamp to
a query:
SELECT NOW()
now |
---|
2018-01-10T16:06:20.623Z |
NOW
can also be combined with other date functions like DATE_ADD and
DATE_DIFF. For example if we wanted to know the age in days of all the deals we
have in progress we could run the following query:
SELECT account,
sales_agent,
DATE_DIFF(created_date, NOW(), "day") AS `days open`,
created_date
FROM sales_pipeline
WHERE sales_pipeline.deal_stage = "In Progress"
AND account IS NOT NULL
account | sales_agent | days open | created_date |
---|---|---|---|
Fasehatice | Daniell Hammack | 322 | 2017-02-22 |
Rundofase | Vicki Laflamme | 322 | 2017-02-22 |
Opentech | Markita Hansen | 409 | 2016-11-27 |
Blackzim | Kary Hendrixson | 318 | 2017-02-26 |
Bluth Company | Maureen Marcano | 318 | 2017-02-26 |
Y-corporation | Daniell Hammack | 317 | 2017-02-27 |
Bioplex | Maureen Marcano | 316 | 2017-02-28 |
Kinnamplus | Boris Faz | 315 | 2017-03-01 |
DATE_PART
DATE_PART
is a function that returns a numeric value for part of a date or
datetime value. Like DATE_ADD
and DATE_DIFF
, DATE_PART
takes different
units as one of its arguments. The unit types recognized by DATE_PART
are
“timezone”, “timezonehour”, “timezoneminute”, “year”, “decade”, “century”,
“quarter”, “month”, “week”, “day”, “dayofweek”, “dayofyear”, “hour”, “minute”,
“second”, and “millisecond”.
Though there are fields for create date and close date in our dataset, there isn’t one for the quarter in which deals were opened or closed. If we wanted to know in which quarters the biggest deals were closed, we could run a query like this:
SELECT account,
sales_agent,
DATE_PART("quarter", close_date) AS qtr,
YEAR(sales_pipeline.close_date),
close_value
FROM sales_pipeline
WHERE sales_pipeline.deal_stage = "Won"
ORDER BY close_value DESC
The function YEAR
which was used in the last query is one
of the other datetime functions explicated in the reference section.
The results of which look like:
account | sales_agent | qtr | year | close_value |
---|---|---|---|---|
Groovestreet | Rosalina Dieter | 2 | 2017 | 30,288 |
Goodsilron | Markita Hansen | 2 | 2017 | 29,617 |
Xx-holding | Elease Gluck | 4 | 2017 | 29,220 |
Cheers | Markita Hansen | 3 | 2017 | 29,166 |
Cheers | Elease Gluck | 4 | 2017 | 27,971 |
Labdrill | Elease Gluck | 2 | 2017 | 27,385 |
Xx-holding | Rosalina Dieter | 4 | 2017 | 26,186 |
Zoomit | Elease Gluck | 1 | 2017 | 25,897 |
Kan-code | Rosalina Dieter | 3 | 2017 | 25,791 |
Cheers | Elease Gluck | 2 | 2017 | 25,464 |
Y-corporation | Rosalina Dieter | 2 | 2017 | 25,288 |
Plexzap | Elease Gluck | 2 | 2017 | 24,949 |
Exercises (Continued from previous section)
There are two ways to do these exercises. The first is to use the “Try query” links to test your queries without saving them. The second is to create a data.world project and save your queries to it. If you are reading this documentation and completing the exercises as a tutorial, you will need to create your own project to save your work. Details and instructions are in the SQL tutorial which has instructions for setting up your project and links to all the current exercises.
Exercise 32
Write a query against the patients table that returns the patient, birthdate, and deathdate, and the patients age in years as age at death. The results should only include patients who have died, and they should be ordered by the age at death, oldest to youngest.
Hint
SELECT
the columns patient,
birthdate,
deathdate,
and find the age of the person at death using DATE_DIFF
on the columns (birthdate,
and deathdate,
with the value rendered in “years”)
. Label the results column AS
`age at death`
. Select the columns FROM
the patients
table and only return records WHERE
the deathdate
IS NOT NULL
. ORDER
your results BY
`age at death`
.
An introduction to working with arrays.