An aggregation is the result returned from running an aggregate function in a query and is displayed in a new column in the results table. An aggregate function is a function that results of a set of the data queried being condensed into an aggregation.
The most common aggregate functions are
SUM returns the sum of all numeric value in a specified column.
COUNT preceding a column name returns the number of rows for which the value is not
NULL and can be used with both numeric and non-numeric field-types.
COUNT(*) returns the number of rows in the table specified in the
AVG returns the average of all the entries in a column which are not NULL.
AVG can only be used with numeric data.
MIN returns the lowest value in a specified column and can be used with numeric and non-numeric data.
MAX returns the greatest value in a specified column and can be used with both numeric and non-numeric data.
GROUP_CONCAT is the only aggregate function designed to work with strings. It takes string values and aggregates them into one value for each row with an optional separator.
If we wanted to know the total value of all the deals closed we would write:
SELECT SUM(close_value) FROM sales_pipeline
and the resulting table would be:
Similarly, if we wanted to see the number of deals we had won in our sales pipeline we could write a query like this:
SELECT COUNT(*) FROM sales_pipeline WHERE deal_stage = "Won"
and the result returned would look like:
Likewise we could show the average deal won:
SELECT AVG(close_value) FROM sales_pipeline WHERE sales_pipeline.deal_stage = "Won"
close_valueof lost deals. If we run an average on the
close_valuecolumn without specifying we want to only include the deals won we’ll get an average of all the completed deals regardless of whether they were won and lost. While an average of all finished deals is potentially interesting data, it isn’t the same as an average value of the actual sales.
For the largest deal we would query:
SELECT MAX(close_value) FROM sales_pipeline
And for the smallest deal:
SELECT MIN(close_value) FROM sales_pipeline WHERE sales_pipeline.deal_stage = "Won"
AVG, we had to include the qualification to only use data from deals that were won because otherwise the smallest deal size would have been ‘0’–the value for lost sales.
Until we look at the
GROUP BY clause in the next section, the only use we have for
GROUP_CONCAT at this point is to get a list of the values from one column in a table and to return it as a string. For example, if we wanted a list of all of the managers we could run the query:
SELECT GROUP_CONCAT(manager) Managers FROM sales_teams
The result would be:
|Rocco Neubert Celia Rouche Summer Sewald Summer Sewald Celia Rouche Melvin Marxen Melvin Marxen Summer Sewald Dustin Brinkmann Melvin Marxen Melvin Marxen Rocco Neubert Summer Sewald Melvin Marxen Rocco Neubert Cara Losch Celia Rouche Cara Losch Celia Rouche Rocco Neubert Dustin Brinkmann Cara Losch Rocco Neubert Cara Losch Dustin Brinkmann Cara Losch Summer Sewald Celia Rouche Dustin Brinkmann Dustin Brinkmann|
There are two problems with that result. First, it is not a list of the managers, it’s a list of all the instances of manager (one for each sale agent in the table). It we want to return each manager’s name only once we need to use the
DISTINCT modifier. Second, the names all run together, and it’s likely that if we wanted a list we’d want the entries separated. Fortunately,
GROUP_CONCAT has the modifier
SEPARATOR which inserts the specified string between each value returned by
The amended search would look like:
SELECT GROUP_CONCAT(DISTINCT manager SEPARATOR ", ") AS Managers FROM sales_teams
And the result would be:
|Rocco Neubert, Celia Rouche, Summer Sewald, Melvin Marxen, Dustin Brinkmann, Cara Losch|
ASwas used in the preceding query because without it the column name used in the table would have been undefined.
An introduction to the
GROUP BY clause and