If we need to group the same data at
multiple granularities (levels), Let’s say at day level, month level and year
level all we will be doing is create the same script multiple times with day
columns removed at month level script, day and month columns removed at year
level script and execute them separately leading to multiple scripts hitting
the same tables in our database, alternatively we can use Grouping Sets, Rollup
and Cubes to achieve the same with a single script and reducing the number of
hits to the server.
Grouping Sets
SQL Server and Oracle:
Syntax:
Select
<<Grouping columns>>
, <<Aggregated
Columns>>
,Grouping_ID
(<<Grouping columns>>)
, GROUPING(<<Grouping columns>>)
From
<<Tables>>
Group
by Grouping Sets (
(<<Grouping
columns set1>>)
,
(<<Grouping columns set2>>)
…
)
Example:
For simplicity of understanding I am
using date column, year column, month column and order quantity is the
aggregation column (SUM)
As
shown below for day level we are including all the columns, month level we are
excluding day column and year level we are
GROUPING
GROUPING Indicates whether a specified
column expression in a GROUP BY list is aggregated or not. GROUPING returns 1
if not part of grouping or aggregated and 0 if it is part of grouping or not
aggregated in the result set.
So
for below example for year level i.e. col_date and col_month are not part of
grouping and col_year is part of grouping and the values are GROUPING(col_date) =1 ,
GROUPING(col_month) =1 and GROUPING(col_year) =0
GROUPING_ID
Grouping_ID
indicates the integer equivalent of binary value of the list of columns, so the
binary value of Year Level i.e is 110 as GROUPING(col_date) =1 and
GROUPING(col_month) =1 and GROUPING(col_year) =0 and the integer equivalent of
110 is 6(4+2+0)
If the order of columns in
grouping_id list is changed the output will also change for example for the
same year level if we have gropuing columns in opposite order i.e. GROUPING_ID(col_year,col_month,
col_date) will be 011 so the integer equivalent is 3 (0+2+1)
Script:
;WITH TABLE1 AS --
remove semicolon for oracle
(
SELECT
'20140101'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS COL_MONTH
, 10 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
UNION ALL
SELECT
'20140102'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS
COL_MONTH
, 11 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
UNION ALL
SELECT
'20140105'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS
COL_MONTH
, 12 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
UNION ALL
SELECT
'20140201'
AS COL_DATE
,'2014' AS
COL_YEAR
,'02' AS
COL_MONTH
, 13 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
UNION ALL
SELECT
'20140204'
AS COL_DATE
,'2014' AS
COL_YEAR
,'02' AS
COL_MONTH
, 14 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
)
SELECT
SUM(ORDER_QUANTITY) AS
ORDER_QUANTITY
,COL_DATE
,COL_MONTH
,COL_YEAR
,CASE
WHEN
GROUPING(COL_DATE) =1
AND
GROUPING(COL_MONTH) =1
AND
GROUPING(COL_YEAR) =0
THEN
'YEAR_LEVEL'
WHEN
GROUPING(COL_DATE) =1
AND
GROUPING(COL_MONTH) =0
AND
GROUPING(COL_YEAR) =0
THEN
'MONTH_LEVEL'
WHEN
GROUPING(COL_DATE) =0
AND
GROUPING(COL_MONTH) =0
AND
GROUPING(COL_YEAR) =0
THEN
'DAY_LEVEL'
END AS
GROUPING_DATA_LEVEL
,CASE
WHEN
GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=6
THEN
'YEAR_LEVEL'
WHEN
GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=4
THEN
'MONTH_LEVEL'
WHEN
GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=0
THEN 'DAY_LEVEL'
END AS
GROUPING_ID_DATA_LEVEL
FROM TABLE1
GROUP BY GROUPING SETS
(
(
COL_DATE
,COL_MONTH
,COL_YEAR
),
(
COL_MONTH
,COL_YEAR
),
(
COL_YEAR
)
)
ROLLUP:
If all we are trying to do is
getting all levels of data i.e day level, month level and year level then we
can use rollup, all rollup does is if we specify ROLLUP (a,b,c) it will output records
with group by (a, b, c),group by (a, b), group by (a) and a grand total rows. In
our case col_year is a, col_month is b and col_date is c.
So we will surely have attributes or
columns specific to each level what should we do? simple we have to put the same
level columns inside brackets i.e. let’s say we have Day_of_Year column which
has date and is day level then our group by will be
GROUP BY ROLLUP
(COL_YEAR,COL_MONTH,( COL_DATE, Day_of_Year) )
Script:
;WITH TABLE1 AS --
remove semicolon for oracle
(
SELECT
'20140101'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS
COL_MONTH
, 10 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
UNION ALL
SELECT
'20140102'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS
COL_MONTH
, 11 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
UNION ALL
SELECT
'20140105'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS
COL_MONTH
, 12 AS ORDER_QUANTITY
--
for oracle add "from dual"
UNION ALL
SELECT
'20140201'
AS COL_DATE
,'2014' AS
COL_YEAR
,'02' AS
COL_MONTH
, 13 AS
ORDER_QUANTITY
--
for oracle add "from dual"
UNION ALL
SELECT
'20140204'
AS COL_DATE
,'2014' AS
COL_YEAR
,'02' AS COL_MONTH
, 14 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
)
SELECT
SUM(ORDER_QUANTITY)
,COL_DATE
,COL_MONTH
,COL_YEAR
,CASE
WHEN
GROUPING(COL_DATE) =1
AND
GROUPING(COL_MONTH) =1
AND
GROUPING(COL_YEAR) =0
THEN
'YEAR_LEVEL'
WHEN GROUPING(COL_DATE)
=1
AND
GROUPING(COL_MONTH) =0
AND
GROUPING(COL_YEAR) =0
THEN
'MONTH_LEVEL'
WHEN
GROUPING(COL_DATE) =0
AND
GROUPING(COL_MONTH) =0
AND
GROUPING(COL_YEAR) =0
THEN
'DAY_LEVEL'
END AS
GROUPING_DATA_LEVEL
,CASE
WHEN
GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=6
THEN
'YEAR_LEVEL'
WHEN
GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=4
THEN
'MONTH_LEVEL'
WHEN
GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)=0
THEN
'DAY_LEVEL'
END AS
GROUPING_ID_DATA_LEVEL
FROM TABLE1
GROUP BY ROLLUP (
COL_YEAR
,COL_MONTH
, COL_DATE
)
CUBE:
Cube
grouping outputs all the combination of groupings for the group by columns, for
example if we have CUBE (a,b,c) it will output records with group by (a, b, c),
(a, b), (a, c), (b, c), (a), (b),(c) and a grand total rows.
Script:
;WITH TABLE1 AS --
remove semicolon for oracle
(
SELECT
'20140101'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS
COL_MONTH
, 10 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
UNION ALL
SELECT
'20140102'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS
COL_MONTH
, 11 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
UNION ALL
SELECT
'20140105'
AS COL_DATE
,'2014' AS
COL_YEAR
,'01' AS
COL_MONTH
, 12 AS
ORDER_QUANTITY
--
for oracle add "from dual"
UNION ALL
SELECT
'20140201'
AS COL_DATE
,'2014' AS
COL_YEAR
,'02' AS
COL_MONTH
, 13 AS
ORDER_QUANTITY
--
for oracle add "from dual"
UNION ALL
SELECT
'20140204'
AS COL_DATE
,'2014' AS
COL_YEAR
,'02' AS
COL_MONTH
, 14 AS
ORDER_QUANTITY
-- for
oracle add "from dual"
)
SELECT
SUM(ORDER_QUANTITY)
,COL_DATE
,COL_MONTH
,COL_YEAR
,
GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)
FROM TABLE1
GROUP BY CUBE (
COL_YEAR
,COL_MONTH
, COL_DATE
)
ORDER BY
GROUPING_ID(COL_DATE,COL_MONTH,COL_YEAR)
Thank you for
reading this article, check out my other SQL Server vs. Oracle posts.