GROUPING SETS,ROLLUP and CUBE

If we need to group the same data at multiple granularities (levels), Let’s say at day level, month level and year level all we will be doing is create the same script multiple times with day columns removed at month level script, day and month columns removed at year level script and execute them separately leading to multiple scripts hitting the same tables in our database, alternatively we can use Grouping Sets, Rollup and Cubes to achieve the same with a single script and reducing the number of hits to the server.

 

Grouping Sets

 

SQL Server and Oracle:

 

Syntax:

            Select <<Grouping columns>>

, <<Aggregated Columns>>

,Grouping_ID (<<Grouping columns>>)

, GROUPING(<<Grouping columns>>)

            From <<Tables>>

            Group by Grouping Sets (

                                                            (<<Grouping columns set1>>)

                                                            , (<<Grouping columns set2>>)

                                                            …        

                                                     )

 

 

Example:

For simplicity of understanding I am using date column, year column, month column and order quantity is the aggregation column (SUM)

 

As shown below for day level we are including all the columns, month level we are excluding day column and year level we are

 

GROUPING

 

GROUPING Indicates whether a specified column expression in a GROUP BY list is aggregated or not. GROUPING returns 1 if not part of grouping or aggregated and 0 if it is part of grouping or not aggregated in the result set.

So for below example for year level i.e. col_date and col_month are not part of grouping and col_year is part of grouping and the values are GROUPING(col_date) =1 , GROUPING(col_month) =1 and GROUPING(col_year) =0

 

 

 GROUPING_ID

            Grouping_ID indicates the integer equivalent of binary value of the list of columns, so the binary value of Year Level i.e is 110 as GROUPING(col_date) =1 and GROUPING(col_month) =1 and GROUPING(col_year) =0 and the integer equivalent of 110 is 6(4+2+0)

 

If the order of columns in grouping_id list is changed the output will also change for example for the same year level if we have gropuing columns in opposite order i.e. GROUPING_ID(col_year,col_month, col_date)  will be 011 so the integer equivalent is 3 (0+2+1)

 

Script:

;WITH TABLE1 AS  -- remove semicolon for oracle

(

SELECT

'20140101' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 10 AS ORDER_QUANTITY

-- for oracle add "from dual"

UNION ALL

SELECT

'20140102' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 11 AS ORDER_QUANTITY

-- for oracle add "from dual"

UNION ALL

SELECT

'20140105' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 12 AS ORDER_QUANTITY

 -- for oracle add "from dual"

UNION ALL

SELECT

'20140201' AS COL_DATE

,'2014' AS COL_YEAR

,'02' AS COL_MONTH

, 13 AS ORDER_QUANTITY

 -- for oracle add "from dual"

UNION ALL

SELECT

'20140204' AS COL_DATE

,'2014' AS COL_YEAR

,'02' AS COL_MONTH

, 14 AS ORDER_QUANTITY

-- for oracle add "from dual"

)

 

SELECT

SUM(ORDER_QUANTITY) AS ORDER_QUANTITY

,COL_DATE

,COL_MONTH

,COL_YEAR

,CASE

WHEN GROUPING(COL_DATE) =1

AND GROUPING(COL_MONTH) =1

AND GROUPING(COL_YEAR) =0

        THEN 'YEAR_LEVEL'

WHEN GROUPING(COL_DATE) =1

AND GROUPING(COL_MONTH) =0

AND GROUPING(COL_YEAR) =0

        THEN 'MONTH_LEVEL'

WHEN  GROUPING(COL_DATE) =0

AND GROUPING(COL_MONTH) =0

AND GROUPING(COL_YEAR) =0

         THEN  'DAY_LEVEL'

 END AS GROUPING_DATA_LEVEL

,CASE

WHEN GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=6

       THEN 'YEAR_LEVEL'

            WHEN GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=4

       THEN 'MONTH_LEVEL'

             WHEN GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=0

        THEN  'DAY_LEVEL'

 END AS GROUPING_ID_DATA_LEVEL

FROM TABLE1

GROUP BY GROUPING SETS (

(

COL_DATE

,COL_MONTH

,COL_YEAR

),

(

COL_MONTH

,COL_YEAR

),

(

COL_YEAR

)

       )

 

ROLLUP:

 

If all we are trying to do is getting all levels of data i.e day level, month level and year level then we can use rollup, all rollup does is if we specify ROLLUP (a,b,c) it will output records with group by (a, b, c),group by (a, b), group by (a) and a grand total rows. In our case col_year is a, col_month is b and col_date is c.

 

So we will surely have attributes or columns specific to each level what should we do? simple we have to put the same level columns inside brackets i.e. let’s say we have Day_of_Year column which has date and is day level then our group by will be

GROUP BY ROLLUP (COL_YEAR,COL_MONTH,( COL_DATE, Day_of_Year) )

 

Script:

;WITH TABLE1 AS  -- remove semicolon for oracle

(

SELECT

'20140101' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 10 AS ORDER_QUANTITY

-- for oracle add "from dual"

UNION ALL

SELECT

'20140102' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 11 AS ORDER_QUANTITY

-- for oracle add "from dual"

UNION ALL

SELECT

'20140105' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 12 AS ORDER_QUANTITY

 -- for oracle add "from dual"

UNION ALL

SELECT

'20140201' AS COL_DATE

,'2014' AS COL_YEAR

,'02' AS COL_MONTH

, 13 AS ORDER_QUANTITY

 -- for oracle add "from dual"

UNION ALL

SELECT

'20140204' AS COL_DATE

,'2014' AS COL_YEAR

,'02' AS COL_MONTH

, 14 AS ORDER_QUANTITY

-- for oracle add "from dual"

)

 

SELECT

SUM(ORDER_QUANTITY)

,COL_DATE

,COL_MONTH

,COL_YEAR

,CASE

WHEN GROUPING(COL_DATE) =1

AND GROUPING(COL_MONTH) =1

AND GROUPING(COL_YEAR) =0

        THEN 'YEAR_LEVEL'

WHEN GROUPING(COL_DATE) =1

AND GROUPING(COL_MONTH) =0

AND GROUPING(COL_YEAR) =0

        THEN 'MONTH_LEVEL'

WHEN  GROUPING(COL_DATE) =0

AND GROUPING(COL_MONTH) =0

AND GROUPING(COL_YEAR) =0

         THEN  'DAY_LEVEL'

 END AS GROUPING_DATA_LEVEL

,CASE

WHEN GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=6

       THEN 'YEAR_LEVEL'

            WHEN GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=4

       THEN 'MONTH_LEVEL'

             WHEN GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=0

        THEN  'DAY_LEVEL'

 END AS GROUPING_ID_DATA_LEVEL

FROM TABLE1

GROUP BY ROLLUP (

COL_YEAR

,COL_MONTH

, COL_DATE    

                                           )

 

CUBE:

            Cube grouping outputs all the combination of groupings for the group by columns, for example if we have CUBE (a,b,c) it will output records with group by (a, b, c), (a, b), (a, c), (b, c), (a), (b),(c) and a grand total rows.

 

Script:

;WITH TABLE1 AS  -- remove semicolon for oracle

(

SELECT

'20140101' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 10 AS ORDER_QUANTITY

-- for oracle add "from dual"

UNION ALL

SELECT

'20140102' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 11 AS ORDER_QUANTITY

-- for oracle add "from dual"

UNION ALL

SELECT

'20140105' AS COL_DATE

,'2014' AS COL_YEAR

,'01' AS COL_MONTH

, 12 AS ORDER_QUANTITY

 -- for oracle add "from dual"

UNION ALL

SELECT

'20140201' AS COL_DATE

,'2014' AS COL_YEAR

,'02' AS COL_MONTH

, 13 AS ORDER_QUANTITY

 -- for oracle add "from dual"

UNION ALL

SELECT

'20140204' AS COL_DATE

,'2014' AS COL_YEAR

,'02' AS COL_MONTH

, 14 AS ORDER_QUANTITY

-- for oracle add "from dual"

)

 

SELECT

SUM(ORDER_QUANTITY)

,COL_DATE

,COL_MONTH

,COL_YEAR

, GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)

FROM TABLE1

GROUP BY CUBE (

COL_YEAR

,COL_MONTH

, COL_DATE    

                                           )

ORDER BY GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)

 

Thank you for reading this article, check out my other SQL Server vs. Oracle posts.