HAVING
An introduction to the HAVING clause.
HAVING is used with aggregations to filter out results returned by the
aggregation. It is similar to WHERE except that WHERE removes values before
the aggregation function is applied to the values, and HAVING removes values
after aggregation has occurred.
In the last section we ran a query to count the number of deals won and group
them by sales agent. But what if we only wanted to see the counts for those
sales agents who had closed over 200 deals? We would have to run the
aggregation first to get the counts for each, and then we would filter the
aggregations using HAVING to see all of the sales agents who had more than
200 deals closed. The query would look like this:
SELECT sales_agent,
       COUNT(sales_pipeline.close_value) AS `number won`
  FROM sales_pipeline
 WHERE sales_pipeline.deal_stage = "Won"
 GROUP BY sales_pipeline.sales_agent
HAVING COUNT(sales_pipeline.close_value) > 200The results table would look like this:
| sales_agent | number won | 
|---|---|
| Vicki Laflamme | 221 | 
| Anna Snelling | 208 | 
| Kary Hendrixson | 209 | 
| Darcel Schlecht | 349 | 
HAVING, GROUP BY, and FILTER can all be used in the same query to focus
on results returned from very specific queries. For example, if we wanted to
see a table with counts of deals won, counts of deals won where the value was
over 1000, average value of all won deals, and average value of all deals won
whose value was over 1000, and we only wanted to see this data for sales agents
who had won more than 100 deals with a value of over 1000, we could write
a query like this:
SELECT sales_agent,
       COUNT(sales_pipeline.close_value) AS `number won`,
       COUNT(sales_pipeline.close_value)
FILTER(WHERE sales_pipeline.close_value > 1000) AS `number won > 1000`,
       AVG(sales_pipeline.close_value) AS `average of all`,
       AVG(sales_pipeline.close_value)
FILTER(WHERE sales_pipeline.close_value > 1000) AS `avg > 1000`
  FROM sales_pipeline
 WHERE sales_pipeline.deal_stage = "Won"
 GROUP BY sales_pipeline.sales_agent
HAVING COUNT(sales_pipeline.close_value)
FILTER(WHERE sales_pipeline.close_value > 1000) > 100We would get a very small set of data back that looks like this:
| sales_agent | number won | number won > 1000 | average of all | avg > 1000 | 
|---|---|---|---|---|
| Darcel Schlecht | 349 | 272 | 3304.3381088825213 | 4115.036764705882 | 
| Reed Clapper | 155 | 110 | 2827.974193548387 | 3808.981818181818 | 
| Donn Cantrell | 158 | 106 | 2821.8987341772154 | 3938.6132075471696 | 
| Vicki Laflamme | 221 | 111 | 2164.68778280543 | 3985.108108108108 | 
| Cassey Cress | 163 | 112 | 2763.7361963190183 | 3792.4821428571427 | 
Exercises (Continued from previous section)
There are two ways to do these exercises. The first is to use the “Try query” links to test your queries without saving them. The second is to create a data.world project and save your queries to it. If you are reading this documentation and completing the exercises as a tutorial, you will need to create your own project to save your work. Details and instructions are in the SQL tutorial which has instructions for setting up your project and links to all the current exercises.
Exercise 29
For this exercise you’ll use your query result from exercise 28 and modify it so that only patients with an average BMI over 30 are included. Order the results by the maximum body mass index highest to lowest.
Hint
SELECT the column patient, the AVG of the (value) column relabeled AS `AVG BMI` , a COUNT of the (value) column relabeled AS `Number of Readings` , and the MAX of the (value) column. FILTER your query so that it only returns results (WHERE the value column is > 30). Label the filtered results from the value column AS `Max Obese BMI`. All of these columns should be selected FROM the observations_cleaned table. Restrict your results to only include records WHERE the description column = “Body Mass Index”. GROUP your results BY the patient column. Also restrict your results to  those HAVING an AVG of the (value) column which is > 30. ORDER your results BY the MAX of the (value) column in DESCending order.
An introduction to the UNION, INTERSECT, and MINUS clauses.