Aggregations

An introduction to aggregations, aggregation functions, and aggregation modifiers.

An aggregation is the result returned from running an aggregate function in a query and is displayed in a new column in the results table. An aggregate function is a function that results of a set of the data queried being condensed into an aggregation.

The most common aggregate functions are SUM, COUNT, AVG, MIN, MAX, and GROUP_CONCAT. SUM returns the sum of all numeric value in a specified column. COUNT preceding a column name returns the number of rows for which the value is not NULL and can be used with both numeric and non-numeric field-types. COUNT(*) returns the number of rows in the table specified in the FROM clause. AVG returns the average of all the entries in a column which are not NULL. AVG can only be used with numeric data. MIN returns the lowest value in a specified column and can be used with numeric and non-numeric data. MAX returns the greatest value in a specified column and can be used with both numeric and non-numeric data. GROUP_CONCAT is the only aggregate function designed to work with strings. It takes string values and aggregates them into one value for each row with an optional separator.

SUM

If we wanted to know the total value of all the deals closed we would write:

SELECT SUM(close_value)
  FROM sales_pipeline

and the resulting table would be:

sum
10,005,534

COUNT

Similarly, if we wanted to see the number of deals we had won in our sales pipeline we could write a query like this:

SELECT COUNT(*)
  FROM sales_pipeline
 WHERE deal_stage = "Won"

and the result returned would look like:

count
4,238

AVG

Likewise we could show the average deal won:

SELECT AVG(close_value)
  FROM sales_pipeline
 WHERE sales_pipeline.deal_stage = "Won"
avg
2,360.91
Our pipeline data has a stored value of ‘0’ instead of a blank field for close_value of lost deals. If we run an average on the close_value column without specifying we want to only include the deals won we’ll get an average of all the completed deals regardless of whether they were won and lost. While an average of all finished deals is potentially interesting data, it isn’t the same as an average value of the actual sales.

MAXand MIN

For the largest deal we would query:

SELECT MAX(close_value)
  FROM sales_pipeline
max
30,288

And for the smallest deal:

SELECT MIN(close_value)
  FROM sales_pipeline
 WHERE sales_pipeline.deal_stage = "Won"
min
38
As for AVG, we had to include the qualification to only use data from deals that were won because otherwise the smallest deal size would have been ‘0’–the value for lost sales.

GROUP CONCAT

Until we look at the GROUP BY clause in the next section, the only use we have for GROUP_CONCAT at this point is to get a list of the values from one column in a table and to return it as a string. For example, if we wanted a list of all of the managers we could run the query:

SELECT GROUP_CONCAT(manager) Managers
  FROM sales_teams

The result would be:

Managers
Rocco Neubert Celia Rouche Summer Sewald Summer Sewald Celia Rouche Melvin Marxen Melvin Marxen Summer Sewald Dustin Brinkmann Melvin Marxen Melvin Marxen Rocco Neubert Summer Sewald Melvin Marxen Rocco Neubert Cara Losch Celia Rouche Cara Losch Celia Rouche Rocco Neubert Dustin Brinkmann Cara Losch Rocco Neubert Cara Losch Dustin Brinkmann Cara Losch Summer Sewald Celia Rouche Dustin Brinkmann Dustin Brinkmann

There are two problems with that result. First, it is not a list of the managers, it’s a list of all the instances of manager (one for each sale agent in the table). It we want to return each manager’s name only once we need to use the DISTINCT modifier. Second, the names all run together, and it’s likely that if we wanted a list we’d want the entries separated. Fortunately, GROUP_CONCAT has the modifier SEPARATOR which inserts the specified string between each value returned by GROUP_CONCAT.

The amended search would look like:

SELECT GROUP_CONCAT(DISTINCT manager SEPARATOR ", ") AS Managers
  FROM sales_teams

And the result would be:

Managers
Rocco Neubert, Celia Rouche, Summer Sewald, Melvin Marxen, Dustin Brinkmann, Cara Losch
The keyword ASwas used in the preceding query because without it the column name used in the table would have been undefined.