ROLLUP and CUBE

An introduction to ROLLUP and CUBE.

ROLLUP and CUBE make it possible for values to be aggregated on several different axes satisfying more detailed reporting needs. ROLLUP summarizes against a hierarchy of columns used in the GROUP BY clause. CUBE groups by all combinations of the values.

ROLLUP

If we wanted to see closed deals in the pipeline aggregated by sum for the total, the region, and the sales agent, we would write the query like this:

SELECT teams.regional_office,
       pipeline.sales_agent,
       SUM(pipeline.close_value)
  FROM sales_pipeline AS pipeline, sales_teams AS teams
 WHERE sales_pipeline.sales_agent = sales_teams.sales_agent
       AND sales_pipeline.deal_stage = "Won"
 GROUP BY ROLLUP(teams.regional_office, pipeline.sales_agent)
 ORDER BY teams.regional_office, pipeline.sales_agent

And the returned data would look like this:

regional_office sales_agent sum
10,005,534
Central 3,346,293
Central Anna Snelling 275,056
Central Cecily Lampkin 229,800
Central Darcel Schlecht 1,153,214
Central Gladys Colclough 345,674
Central Jonathan Berthelot 284,886
Central Lajuana Vencill 194,632
Central Marty Freudenburg 291,195
Central Moses Frase 207,182
Central Niesha Huffines 176,961
Central Versie Hillebrand 187,693
East 3,090,594
East Boris Faz 261,631
East Cassey Cress 450,489
East Corliss Cosme 421,036
East Daniell Hammack 364,229
East Donn Cantrell 445,860
East Garret Kinder 197,773
East Reed Clapper 438,336
East Rosie Papadopoulos 230,169
East Violet Mclelland 123,431
East Wilburn Farren 157,640
West 3,568,647
West Elease Gluck 289,195
West Hayden Neloms 272,111
West James Ascencio 413,533
West Kami Bicknell 316,456
West Kary Hendrixson 454,298
West Markita Hansen 328,792
West Maureen Marcano 350,395
West Rosalina Dieter 235,403
West Vicki Laflamme 478,396
West Zane Levy 430,068

The row with no value in either of the first two columns is the sum of all the sales in all regions. The rows with data in the first column but not the second contain the sum for that region.

CUBE

The above example was great for ROLLUP because there was an easy hierarchy with the entire pipeline at the top, then sales divided into regions, then the regions further divided into sales agents. However if we wanted to compare the sales numbers for products and sales agents, there isn’t a logical hierarchy: product is not a subcategory of sales agent, and sales agent is not a subcategory of product. A better result combination then might be to return all possible combinations of of the sums of products and sales agents. We would get those combinations using CUBE:

SELECT product,
       sales_agent,
       SUM(close_value)
  FROM sales_pipeline
 WHERE sales_pipeline.deal_stage = "Won"
 GROUP BY CUBE(product, sales_agent)
 ORDER BY product, sales_agent

Scroll down the results table to see the format of the combinations.

product sales_agent sum
10005534
Anna Snelling 275056
Boris Faz 261631
Cassey Cress 450489
Cecily Lampkin 229800
Corliss Cosme 421036
Daniell Hammack 364229
Darcel Schlecht 1153214
Donn Cantrell 445860
Elease Gluck 289195
Garret Kinder 197773
Gladys Colclough 345674
Hayden Neloms 272111
James Ascencio 413533
Jonathan Berthelot 284886
Kami Bicknell 316456
Kary Hendrixson 454298
Lajuana Vencill 194632
Markita Hansen 328792
Marty Freudenburg 291195
Maureen Marcano 350395
Moses Frase 207182
Niesha Huffines 176961
Reed Clapper 438336
Rosalina Dieter 235403
Rosie Papadopoulos 230169
Versie Hillebrand 187693
Vicki Laflamme 478396
Violet Mclelland 123431
Wilburn Farren 157640
Zane Levy 430068
GTK 500 400612
GTK 500 Elease Gluck 184632
GTK 500 Markita Hansen 83528
GTK 500 Rosalina Dieter 132452
GTX Basic 499263
GTX Basic Anna Snelling 17437
GTX Basic Boris Faz 9041
GTX Basic Cassey Cress 19699
GTX Basic Cecily Lampkin 8756
GTX Basic Corliss Cosme 19970
GTX Basic Daniell Hammack 6030
GTX Basic Darcel Schlecht 20124
GTX Basic Donn Cantrell 24462
GTX Basic Elease Gluck 7095
GTX Basic Garret Kinder 8532
GTX Basic Gladys Colclough 7721
GTX Basic James Ascencio 17599
GTX Basic Jonathan Berthelot 39204
GTX Basic Kami Bicknell 38543
GTX Basic Kary Hendrixson 44671
GTX Basic Lajuana Vencill 29903

The data returned from this query shows the sum of all the sales, followed by all sales by sales agent. Then the product info is folded in and the sum of the first product sales is given, followed by the the sum of that product’s sales by sales agent. The product sum followed by product sum by sales agent pattern continues for the rest of the results table.

If one of the columns that you are rolling up contains nulls then you may see ambiguous rows where you cannot tell if the null in that column is a value in a rows, or is due to the rollup itself. In that case you can use the GROUPING aggregation to disambiguate. Add a GROUPING as another column, referencing the column you are grouping by. A one in the GROUPING column means the value is due to rollup. Otherwise a zero will be shown:

query:

 SELECT sales_agent,
        account,
        GROUPING(sales_agent),
        GROUPING(account),
        SUM(order_value)
   FROM apr_2017_orders
  GROUP BY ROLLUP(sales_agent, account)
  ORDER BY sales_agent, account

result: |sales_agent|account|grouping|grouping_2|sum| |-|-|-|-|-| |sales_agent|account|grouping|grouping_2|sum| |-|-|-|-|-| |Boris Faz|Zencorporation|0|0|560| |Boris Faz|Zotware|0|0|59| |Boris Faz||0|1|18957| |Cassey Cress|Donquadtech|0|0|59| |Cassey Cress|Gekko & Co|0|0|1114| |Cassey Cress|Initech|0|0|3275|