ROLLUP and CUBE

An introduction to ROLLUP and CUBE.

ROLLUP and CUBE make it possible for values to be aggregated on several different axes satisfying more detailed reporting needs. ROLLUP summarizes against a hierarchy of columns used in the GROUP BY clause. CUBE groups by all combinations of the values.

`ROLLUP`

If we wanted to see closed deals in the pipeline aggregated by sum for the total, the region, and the sales agent, we would write the query like this:

SELECT teams.regional_office,
       pipeline.sales_agent,
       SUM(pipeline.close_value)
  FROM sales_pipeline AS pipeline, sales_teams AS teams
 WHERE sales_pipeline.sales_agent = sales_teams.sales_agent
       AND sales_pipeline.deal_stage = "Won"
 GROUP BY ROLLUP(teams.regional_office, pipeline.sales_agent)
 ORDER BY teams.regional_office, pipeline.sales_agent

siyeh/sql-crm-example-data Run query Copy code

And the returned data would look like this:

regional_office	sales_agent	sum
		10,005,534
Central		3,346,293
Central	Anna Snelling	275,056
Central	Cecily Lampkin	229,800
Central	Darcel Schlecht	1,153,214
Central	Gladys Colclough	345,674
Central	Jonathan Berthelot	284,886
Central	Lajuana Vencill	194,632
Central	Marty Freudenburg	291,195
Central	Moses Frase	207,182
Central	Niesha Huffines	176,961
Central	Versie Hillebrand	187,693
East		3,090,594
East	Boris Faz	261,631
East	Cassey Cress	450,489
East	Corliss Cosme	421,036
East	Daniell Hammack	364,229
East	Donn Cantrell	445,860
East	Garret Kinder	197,773
East	Reed Clapper	438,336
East	Rosie Papadopoulos	230,169
East	Violet Mclelland	123,431
East	Wilburn Farren	157,640
West		3,568,647
West	Elease Gluck	289,195
West	Hayden Neloms	272,111
West	James Ascencio	413,533
West	Kami Bicknell	316,456
West	Kary Hendrixson	454,298
West	Markita Hansen	328,792
West	Maureen Marcano	350,395
West	Rosalina Dieter	235,403
West	Vicki Laflamme	478,396
West	Zane Levy	430,068

The row with no value in either of the first two columns is the sum of all the sales in all regions. The rows with data in the first column but not the second contain the sum for that region.

`CUBE`

The above example was great for ROLLUP because there was an easy hierarchy with the entire pipeline at the top, then sales divided into regions, then the regions further divided into sales agents. However if we wanted to compare the sales numbers for products and sales agents, there isn’t a logical hierarchy: product is not a subcategory of sales agent, and sales agent is not a subcategory of product. A better result combination then might be to return all possible combinations of of the sums of products and sales agents. We would get those combinations using CUBE:

SELECT product,
       sales_agent,
       SUM(close_value)
  FROM sales_pipeline
 WHERE sales_pipeline.deal_stage = "Won"
 GROUP BY CUBE(product, sales_agent)
 ORDER BY product, sales_agent

siyeh/sql-crm-example-data Run query Copy code

Scroll down the results table to see the format of the combinations.

product	sales_agent	sum
		10005534
	Anna Snelling	275056
	Boris Faz	261631
	Cassey Cress	450489
	Cecily Lampkin	229800
	Corliss Cosme	421036
	Daniell Hammack	364229
	Darcel Schlecht	1153214
	Donn Cantrell	445860
	Elease Gluck	289195
	Garret Kinder	197773
	Gladys Colclough	345674
	Hayden Neloms	272111
	James Ascencio	413533
	Jonathan Berthelot	284886
	Kami Bicknell	316456
	Kary Hendrixson	454298
	Lajuana Vencill	194632
	Markita Hansen	328792
	Marty Freudenburg	291195
	Maureen Marcano	350395
	Moses Frase	207182
	Niesha Huffines	176961
	Reed Clapper	438336
	Rosalina Dieter	235403
	Rosie Papadopoulos	230169
	Versie Hillebrand	187693
	Vicki Laflamme	478396
	Violet Mclelland	123431
	Wilburn Farren	157640
	Zane Levy	430068
GTK 500		400612
GTK 500	Elease Gluck	184632
GTK 500	Markita Hansen	83528
GTK 500	Rosalina Dieter	132452
GTX Basic		499263
GTX Basic	Anna Snelling	17437
GTX Basic	Boris Faz	9041
GTX Basic	Cassey Cress	19699
GTX Basic	Cecily Lampkin	8756
GTX Basic	Corliss Cosme	19970
GTX Basic	Daniell Hammack	6030
GTX Basic	Darcel Schlecht	20124
GTX Basic	Donn Cantrell	24462
GTX Basic	Elease Gluck	7095
GTX Basic	Garret Kinder	8532
GTX Basic	Gladys Colclough	7721
GTX Basic	James Ascencio	17599
GTX Basic	Jonathan Berthelot	39204
GTX Basic	Kami Bicknell	38543
GTX Basic	Kary Hendrixson	44671
GTX Basic	Lajuana Vencill	29903

The data returned from this query shows the sum of all the sales, followed by all sales by sales agent. Then the product info is folded in and the sum of the first product sales is given, followed by the the sum of that product’s sales by sales agent. The product sum followed by product sum by sales agent pattern continues for the rest of the results table.

If one of the columns that you are rolling up contains nulls then you may see ambiguous rows where you cannot tell if the null in that column is a value in a rows, or is due to the rollup itself. In that case you can use the GROUPING aggregation to disambiguate. Add a GROUPING as another column, referencing the column you are grouping by. A one in the GROUPING column means the value is due to rollup. Otherwise a zero will be shown:

query:

 SELECT sales_agent,
        account,
        GROUPING(sales_agent),
        GROUPING(account),
        SUM(order_value)
   FROM apr_2017_orders
  GROUP BY ROLLUP(sales_agent, account)
  ORDER BY sales_agent, account

siyeh/sql-crm-example-data Run query Copy code

result: |sales_agent|account|grouping|grouping_2|sum| |-|-|-|-|-| |sales_agent|account|grouping|grouping_2|sum| |-|-|-|-|-| |Boris Faz|Zencorporation|0|0|560| |Boris Faz|Zotware|0|0|59| |Boris Faz||0|1|18957| |Cassey Cress|Donquadtech|0|0|59| |Cassey Cress|Gekko & Co|0|0|1114| |Cassey Cress|Initech|0|0|3275|

Next up: PIVOT

An introduction to PIVOT.