Aggregations
An introduction to aggregations, aggregation functions, and aggregation modifiers.
An aggregation is the result returned from running an aggregate function in a query and is displayed in a new column in the results table. An aggregate function is a function that results of a set of the data queried being condensed into an aggregation.
The most common aggregate functions are SUM
, COUNT
, AVG
, MIN
, MAX
, and GROUP_CONCAT
. SUM
returns the sum of all numeric value in a specified column. COUNT
preceding a column name returns the number of rows for which the value is not NULL
and can be used with both numeric and non-numeric field-types. COUNT(*)
returns the number of rows in the table specified in the FROM
clause. AVG
returns the average of all the entries in a column which are not NULL. AVG
can only be used with numeric data. MIN
returns the lowest value in a specified column and can be used with numeric and non-numeric data. MAX
returns the greatest value in a specified column and can be used with both numeric and non-numeric data. GROUP_CONCAT
is the only aggregate function designed to work with strings. It takes string values and aggregates them into one value for each row with an optional separator.
SUM
If we wanted to know the total value of all the deals closed we would write:
SELECT SUM(close_value)
FROM sales_pipeline
and the resulting table would be:
sum |
---|
10,005,534 |
COUNT
Similarly, if we wanted to see the number of deals we had won in our sales pipeline we could write a query like this:
SELECT COUNT(*)
FROM sales_pipeline
WHERE deal_stage = "Won"
and the result returned would look like:
count |
---|
4,238 |
AVG
Likewise we could show the average deal won:
SELECT AVG(close_value)
FROM sales_pipeline
WHERE sales_pipeline.deal_stage = "Won"
avg |
---|
2,360.91 |
close_value
of lost deals. If we run an average on the close_value
column without specifying we want to only include the deals won we’ll get an average of all the completed deals regardless of whether they were won and lost. While an average of all finished deals is potentially interesting data, it isn’t the same as an average value of the actual sales.MAX
and MIN
For the largest deal we would query:
SELECT MAX(close_value)
FROM sales_pipeline
max |
---|
30,288 |
And for the smallest deal:
SELECT MIN(close_value)
FROM sales_pipeline
WHERE sales_pipeline.deal_stage = "Won"
min |
---|
38 |
AVG
, we had to include the qualification to only use data from deals that were won because otherwise the smallest deal size would have been ‘0’–the value for lost sales.GROUP CONCAT
Until we look at the GROUP BY
clause in the next section, the only use we have for GROUP_CONCAT
at this point is to get a list of the values from one column in a table and to return it as a string. For example, if we wanted a list of all of the managers we could run the query:
SELECT GROUP_CONCAT(manager) Managers
FROM sales_teams
The result would be:
Managers |
---|
Rocco Neubert Celia Rouche Summer Sewald Summer Sewald Celia Rouche Melvin Marxen Melvin Marxen Summer Sewald Dustin Brinkmann Melvin Marxen Melvin Marxen Rocco Neubert Summer Sewald Melvin Marxen Rocco Neubert Cara Losch Celia Rouche Cara Losch Celia Rouche Rocco Neubert Dustin Brinkmann Cara Losch Rocco Neubert Cara Losch Dustin Brinkmann Cara Losch Summer Sewald Celia Rouche Dustin Brinkmann Dustin Brinkmann |
There are two problems with that result. First, it is not a list of the managers, it’s a list of all the instances of manager (one for each sale agent in the table). It we want to return each manager’s name only once we need to use the DISTINCT
modifier. Second, the names all run together, and it’s likely that if we wanted a list we’d want the entries separated. Fortunately, GROUP_CONCAT
has the modifier SEPARATOR
which inserts the specified string between each value returned by GROUP_CONCAT
.
The amended search would look like:
SELECT GROUP_CONCAT(DISTINCT manager SEPARATOR ", ") AS Managers
FROM sales_teams
And the result would be:
Managers |
---|
Rocco Neubert, Celia Rouche, Summer Sewald, Melvin Marxen, Dustin Brinkmann, Cara Losch |
AS
was used in the preceding query because without it the column name used in the table would have been undefined.An introduction to the GROUP BY
clause and FILTER
modifier.