Working With Dates

An introduction to working with dates.

SQL has several built-in functions for manipulating date data in fields. Some of the more common ones are DATE_DIFF, DATE_ADD, NOW, and DATE_PART

For a full list of date functions see the reference section.

DATE_DIFF

DATE_DIFF returns the difference between two date or datetime values as a given unit of duration. It takes three arguments: the start date, the end date, and the unit of duration (one of “year”, “decade”, “century”, “quarter”, “month”, “week”, “day”, “hour”, “minute”, “second”, or “millisecond”).

If we wanted to know how many deals took more than 135 days to close in our deal pipeline we could run the query:

SELECT account,
       sales_agent,
       DATE_DIFF(created_date, close_date, "day") AS `days to close`
  FROM sales_pipeline
 WHERE DATE_DIFF(created_date, close_date, "day") > 135

And the results would look like:

account sales_agent days to close
Isdom Darcel Schlecht 137
Warephase Darcel Schlecht 138
Funholding Niesha Huffines 136
Gogozoom Anna Snelling 136
Hottechi Moses Frase 137
Zotware Garret Kinder 137
Codehow Gladys Colclough 136
Vehement Capital Partners Cassey Cress 138

We could also combine DATE_DIFF with AVG to find out the average number of days deals in the pipeline took to close:

SELECT AVG(DATE_DIFF(created_date, close_date, "day"))
  FROM sales_pipeline
avg
49.99

DATE_ADD

DATE_ADD takes a date and adds a specified amount of time to it. As with DATE_DIFF, the time added can be any of the following: “year”, “decade”, “century”, “quarter”, “month”, “week”, “day”, “hour”, “minute”, “second”, or “millisecond”.

The number we got for average days to close from the last query can be combined with the DATE_ADD function to see which deals took longer than average to close:

SELECT account,
      sales_agent,
      DATE_DIFF(created_date, close_date, "day") AS `days to close`,
      created_date,
      close_date
 FROM sales_pipeline
WHERE DATE_ADD(sales_pipeline.created_date, 50, "day") < close_date
account sales_agent days to close created_date close_date
Cancity Moses Frase 132 2016-10-20 2017-03-01
Isdom Darcel Schlecht 137 2016-10-25 2017-03-11
Bioholding Anna Snelling 126 2016-11-04 2017-03-10
Donquadtech Violet Mclelland 101 2016-11-25 2017-03-06
The New York Inquirer Boris Faz 89 2017-02-22 2017-05-22
Vehement Capital Partners Corliss Cosme 106 2017-02-22 2017-06-08
Zencorporation Corliss Cosme 126 2017-02-22 2017-06-28
Sumace Daniell Hammack 84 2017-02-22 2017-05-17

NOW

The NOWfunction returns the current date and time providing a timestamp to a query:

SELECT NOW()
now
2018-01-10T16:06:20.623Z

NOW can also be combined with other date functions like DATE_ADD and DATE_DIFF. For example if we wanted to know the age in days of all the deals we have in progress we could run the following query:

SELECT account,
       sales_agent,
       DATE_DIFF(created_date, NOW(), "day") AS `days open`,
       created_date
  FROM sales_pipeline
 WHERE sales_pipeline.deal_stage = "In Progress"
       AND account IS NOT NULL
account sales_agent days open created_date
Fasehatice Daniell Hammack 322 2017-02-22
Rundofase Vicki Laflamme 322 2017-02-22
Opentech Markita Hansen 409 2016-11-27
Blackzim Kary Hendrixson 318 2017-02-26
Bluth Company Maureen Marcano 318 2017-02-26
Y-corporation Daniell Hammack 317 2017-02-27
Bioplex Maureen Marcano 316 2017-02-28
Kinnamplus Boris Faz 315 2017-03-01

DATE_PART

DATE_PART is a function that returns a numeric value for part of a date or datetime value. Like DATE_ADD and DATE_DIFF, DATE_PART takes different units as one of its arguments. The unit types recognized by DATE_PART are “timezone”, “timezonehour”, “timezoneminute”, “year”, “decade”, “century”, “quarter”, “month”, “week”, “day”, “dayofweek”, “dayofyear”, “hour”, “minute”, “second”, and “millisecond”.

Though there are fields for create date and close date in our dataset, there isn’t one for the quarter in which deals were opened or closed. If we wanted to know in which quarters the biggest deals were closed, we could run a query like this:

SELECT account,
       sales_agent,
       DATE_PART("quarter", close_date) AS qtr,
       YEAR(sales_pipeline.close_date),
       close_value
  FROM sales_pipeline
 WHERE sales_pipeline.deal_stage = "Won"
 ORDER BY close_value DESC

The function YEAR which was used in the last query is one of the other datetime functions explicated in the reference section.

The results of which look like:

account sales_agent qtr year close_value
Groovestreet Rosalina Dieter 2 2017 30,288
Goodsilron Markita Hansen 2 2017 29,617
Xx-holding Elease Gluck 4 2017 29,220
Cheers Markita Hansen 3 2017 29,166
Cheers Elease Gluck 4 2017 27,971
Labdrill Elease Gluck 2 2017 27,385
Xx-holding Rosalina Dieter 4 2017 26,186
Zoomit Elease Gluck 1 2017 25,897
Kan-code Rosalina Dieter 3 2017 25,791
Cheers Elease Gluck 2 2017 25,464
Y-corporation Rosalina Dieter 2 2017 25,288
Plexzap Elease Gluck 2 2017 24,949

Exercises (Continued from previous section)

There are two ways to do these exercises. The first is to use the “Try query” links to test your queries without saving them. The second is to create a data.world project and save your queries to it. If you are reading this documentation and completing the exercises as a tutorial, you will need to create your own project to save your work. Details and instructions are in the SQL tutorial which has instructions for setting up your project and links to all the current exercises.

Exercise 32

Write a query against the patients table that returns the patient, birthdate, and deathdate, and the patients age in years as age at death. The results should only include patients who have died, and they should be ordered by the age at death, oldest to youngest.

Try query

Hint

SELECT the columns patient, birthdate, deathdate, and find the age of the person at death using DATE_DIFF on the columns (birthdate, and deathdate, with the value rendered in “years”). Label the results column AS `age at death`. Select the columns FROM the patients table and only return records WHERE the deathdate IS NOT NULL. ORDER your results BY `age at death`.

See solution