DB2 OLAP
1.1
Population by Sex (rows) and Country (columns):

Sex      Australia    Denmark     Germany  Netherlands  United States        Total
Male     9,913,658  2,676,377  40,413,132    8,079,392    143,957,558  205,040,117
Female   9,999,486  2,737,015  42,011,477    8,238,807    149,070,013  212,056,798
Total   19,913,144  5,413,392  82,424,609   16,318,199    293,027,571  417,096,915

Relational representation of the cross tab:

Country        Sex      Population
Australia      male      9,913,658
Australia      female    9,999,486
Australia      all      19,913,144
Denmark        male      2,676,377
Denmark        female    2,737,015
Denmark        all       5,413,392
Germany        male     40,413,132
Germany        female   42,011,477
Germany        all      82,424,609
Netherlands    male      8,079,392
Netherlands    female    8,238,807
Netherlands    all      16,318,199
United States  male    143,957,558
United States  female  149,070,013
United States  all     293,027,571
all            male    205,040,117
all            female  212,056,798
all            all     417,096,915
1.2
OLAP Terminology
Suppose an analyst wishes to see a population cross tab on country and sex for a fixed
value of the size of the states of the respective countries, for example 10,000 km²,
instead of the sum across all states.
. Such an operation is referred to as slicing.
. If values from multiple dimensions are fixed, the operation is called dicing.
Moving from fine-granularity data to coarse-granularity data is called rollup. The opposite direction, moving from coarse-granularity data to fine-granularity data, is called drill down.
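
Slicing and dicing can be sketched as plain filters in front of the aggregation. The example below is only an illustration under stated assumptions: it uses Python's sqlite3 module instead of DB2, a hypothetical column state_size_km2 as the third dimension, and invented placeholder rows (countries X and Y).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE population "
    "(country TEXT, sex TEXT, state_size_km2 INTEGER, population INTEGER)"
)
# Invented toy rows: two countries, two sexes, two state sizes.
conn.executemany(
    "INSERT INTO population VALUES (?, ?, ?, ?)",
    [
        ("X", "male", 10000, 5), ("X", "female", 10000, 6),
        ("X", "male", 20000, 7), ("X", "female", 20000, 8),
        ("Y", "male", 10000, 1), ("Y", "female", 10000, 2),
        ("Y", "male", 20000, 3), ("Y", "female", 20000, 4),
    ],
)

# Slice: fix one dimension (state size) and cross-tabulate the other two.
slice_rows = conn.execute(
    "SELECT country, sex, sum(population) FROM population "
    "WHERE state_size_km2 = 10000 "
    "GROUP BY country, sex ORDER BY country, sex"
).fetchall()

# Dice: fix values on more than one dimension at once.
dice_rows = conn.execute(
    "SELECT country, sum(population) FROM population "
    "WHERE state_size_km2 = 10000 AND sex = 'female' "
    "GROUP BY country ORDER BY country"
).fetchall()

print(slice_rows)
print(dice_rows)
```

The slice keeps both remaining dimensions in the GROUP BY; the dice fixes sex as well, so only country is left to group on.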
Extended Aggregation
2.1
2.2
The GROUP BY clause and its GROUPING SETS extension are used to group individual rows into
combined sets based on the values in one or more columns.
2.2.1
Cube Operation
The CUBE operation computes the union of GROUP BYs on every subset of the specified attributes.
For each grouping, the result contains the null value for attributes not present in
the grouping.
Applied to country and sex, CUBE computes the relational representation of the population
cross tab that we saw earlier.
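
The "union of GROUP BYs on every subset" reading can be checked with a small sketch. It assumes Python's sqlite3 module rather than DB2; SQLite has no CUBE, so the helper cube_sql (a hypothetical name) builds one SELECT per subset and glues them together with UNION ALL. The male and female figures are the ones from the cross tab.

```python
import sqlite3
from itertools import combinations

# Male and female populations taken from the population cross tab.
ROWS = [
    ("Australia", "male", 9913658), ("Australia", "female", 9999486),
    ("Denmark", "male", 2676377), ("Denmark", "female", 2737015),
    ("Germany", "male", 40413132), ("Germany", "female", 42011477),
    ("Netherlands", "male", 8079392), ("Netherlands", "female", 8238807),
    ("United States", "male", 143957558), ("United States", "female", 149070013),
]

def cube_sql(table, dims, measure):
    """Build one SELECT per subset of dims and join them with UNION ALL."""
    parts = []
    for size in range(len(dims), -1, -1):
        for subset in combinations(dims, size):
            # Attributes missing from this grouping are reported as NULL.
            cols = ", ".join(d if d in subset else f"NULL AS {d}" for d in dims)
            group = f" GROUP BY {', '.join(subset)}" if subset else ""
            parts.append(
                f"SELECT {cols}, sum({measure}) AS total FROM {table}{group}"
            )
    return " UNION ALL ".join(parts)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (country TEXT, sex TEXT, population INTEGER)")
conn.executemany("INSERT INTO population VALUES (?, ?, ?)", ROWS)

cube_rows = conn.execute(
    cube_sql("population", ["country", "sex"], "population")
).fetchall()
for row in cube_rows:
    print(row)
```

The result has 18 rows: 10 detail rows, 5 per-country totals, 2 per-sex totals, and 1 grand total, which matches the relational form of the cross tab with None in place of "all".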
The function grouping() can be used to identify which rows come from which grouping set.
. A value of 1 indicates that the corresponding data field is null because the row comes
from a grouping set that does not involve this column.
. Otherwise, the value is zero.
Example:
SELECT country, sex, sum(population),
  grouping(country) AS country_flag,
  grouping(sex) AS sex_flag
FROM population
GROUP BY CUBE(country, sex);
You can use the CASE expression in the SELECT clause to replace such nulls (presented
as -) by a value such as all.
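
A minimal sketch of the CASE replacement follows, again assuming Python's sqlite3 module instead of DB2. SQLite offers neither CUBE nor grouping(), so each UNION ALL branch carries constant flags that play the role of grouping(country) and grouping(sex). The aliases country_label and sex_label are illustrative names; the figures for Denmark and the Netherlands come from the cross tab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (country TEXT, sex TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO population VALUES (?, ?, ?)",
    [
        ("Denmark", "male", 2676377), ("Denmark", "female", 2737015),
        ("Netherlands", "male", 8079392), ("Netherlands", "female", 8238807),
    ],
)

# A flag of 1 means "this column was aggregated away in this grouping set",
# which is what grouping() would report in DB2.
query = """
SELECT CASE WHEN country_flag = 1 THEN 'all' ELSE country END AS country_label,
       CASE WHEN sex_flag = 1 THEN 'all' ELSE sex END AS sex_label,
       total
FROM (
    SELECT country, sex, sum(population) AS total,
           0 AS country_flag, 0 AS sex_flag
    FROM population GROUP BY country, sex
    UNION ALL
    SELECT country, NULL, sum(population), 0, 1 FROM population GROUP BY country
    UNION ALL
    SELECT NULL, sex, sum(population), 1, 0 FROM population GROUP BY sex
    UNION ALL
    SELECT NULL, NULL, sum(population), 1, 1 FROM population
)
ORDER BY country_flag, country_label, sex_flag, sex_label
"""
labelled = conn.execute(query).fetchall()
for row in labelled:
    print(row)
```

Testing the flag rather than "country IS NULL" matters when the base data can itself contain nulls: only the flag distinguishes a sub-total row from a genuine null value.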
2.2.2
Rollup Operation
Suppose there exists a dimension hierarchy in the population relation which can be
used to aggregate by town, state, and country.
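
For such a hierarchy, a rollup over (country, state, town) produces one grouping per prefix of the list: (country, state, town), (country, state), (country), and the grand total. The sketch below emulates this in Python's sqlite3 module, which has no ROLLUP. The helper rollup_sql, the table layout, and the placeholder rows are assumptions made for illustration.

```python
import sqlite3

# Invented placeholder rows: (country, state, town, population).
ROWS = [
    ("X", "X1", "X1a", 100),
    ("X", "X1", "X1b", 50),
    ("X", "X2", "X2a", 70),
    ("Y", "Y1", "Y1a", 30),
]

def rollup_sql(table, dims, measure):
    """Build one SELECT per prefix of dims (longest first), joined by UNION ALL."""
    parts = []
    for n in range(len(dims), -1, -1):
        prefix = dims[:n]
        # Levels below the current prefix are reported as NULL.
        cols = ", ".join(d if d in prefix else f"NULL AS {d}" for d in dims)
        group = f" GROUP BY {', '.join(prefix)}" if prefix else ""
        parts.append(f"SELECT {cols}, sum({measure}) AS total FROM {table}{group}")
    return " UNION ALL ".join(parts)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE population "
    "(country TEXT, state TEXT, town TEXT, population INTEGER)"
)
conn.executemany("INSERT INTO population VALUES (?, ?, ?, ?)", ROWS)

rollup_rows = conn.execute(
    rollup_sql("population", ["country", "state", "town"], "population")
).fetchall()
for row in rollup_rows:
    print(row)
```

Unlike the cube, the rollup never groups by town or state alone, so the order of the attributes matters.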
2.2.3
The GROUPING SETS clause enables us to get multiple GROUP BY result sets using a
single statement.
Example:
  GROUP BY GROUPING SETS ((year,country,sex))
is equivalent to
  GROUP BY year, country, sex

  GROUP BY GROUPING SETS (year,country,sex)
is equivalent to
  GROUP BY year
  UNION ALL
  GROUP BY country
  UNION ALL
  GROUP BY sex

  GROUP BY GROUPING SETS (year,(country,sex))
is equivalent to
  GROUP BY year
  UNION ALL
  GROUP BY country, sex
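Multiple GROUPING SETS in the same GROUP BY are combined together as if they
were simple fields in a GROUP BY list: every grouping set of one clause is combined
with every grouping set of the others.
Example:
  GROUP BY GROUPING SETS (year),
           GROUPING SETS (country),
           GROUPING SETS (sex)
is equivalent to
  GROUP BY year, country, sex

  GROUP BY GROUPING SETS (year),
           GROUPING SETS ((country,sex))
is equivalent to
  GROUP BY year, country, sex

  GROUP BY GROUPING SETS (year),
           GROUPING SETS (country,sex)
is equivalent to
  GROUP BY year, country
  UNION ALL
  GROUP BY year, sex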
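
The combination rule can be sketched as a cross product over the grouping sets of each clause. This is a pure-Python illustration and not DB2 code; combine_grouping_sets is a hypothetical helper that returns the column list of each resulting GROUP BY branch.

```python
from itertools import product

def combine_grouping_sets(*clauses):
    """Cross-multiply several GROUPING SETS clauses.

    Each clause is a list of grouping sets, and each grouping set is a tuple
    of column names. The result lists the columns of every GROUP BY branch
    that would be joined by UNION ALL.
    """
    return [
        tuple(col for grouping_set in combo for col in grouping_set)
        for combo in product(*clauses)
    ]

# GROUPING SETS (year), GROUPING SETS (country), GROUPING SETS (sex)
simple = combine_grouping_sets([("year",)], [("country",)], [("sex",)])

# GROUPING SETS (year), GROUPING SETS ((country,sex))
nested = combine_grouping_sets([("year",)], [("country", "sex")])

# GROUPING SETS (year), GROUPING SETS (country,sex)
flat = combine_grouping_sets([("year",)], [("country",), ("sex",)])

print(simple)
print(nested)
print(flat)
```

The first two calls yield the single grouping (year, country, sex); the third yields two groupings, (year, country) and (year, sex), because the second clause holds two separate grouping sets.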
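ROLLUP and CUBE are shorthand forms of particular types of GROUPING SETS clause.
A ROLLUP expression displays sub-totals for each prefix of the specified field list. Example:
  GROUP BY ROLLUP(year, country, sex)
is equivalent to
  GROUP BY
  GROUPING SETS((year,country,sex),
                (year,country),
                (year),
                ())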
A CUBE expression displays a cross tab of the sub-totals for any specified fields. Example:
  GROUP BY CUBE(year, country, sex)
is equivalent to
  GROUP BY
  GROUPING SETS((year,country,sex),
                (year,country),
                (year,sex),
                (country,sex),
                (year),
                (country),
                (sex),
                ())
Ranking
Given the relation population(country, number) find the rank of each country.
SELECT country, rank() OVER (ORDER BY number DESC) AS n_rank
FROM population
Example query: Find the rank of the countries within each sex in terms of their
population size.
SELECT country, rank() OVER (PARTITION BY sex ORDER BY number DESC) AS n_rank
FROM population
ORDER BY sex, n_rank
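
The partitioned ranking can be tried out with Python's sqlite3 module, assuming the bundled SQLite is version 3.25 or later (the first release with window functions). The relation population(country, sex, number) is an assumed layout; the figures for three of the countries come from the cross tab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (country TEXT, sex TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO population VALUES (?, ?, ?)",
    [
        ("Australia", "male", 9913658), ("Australia", "female", 9999486),
        ("Denmark", "male", 2676377), ("Denmark", "female", 2737015),
        ("Germany", "male", 40413132), ("Germany", "female", 42011477),
    ],
)

# rank() restarts at 1 inside every partition, here once per sex.
ranked = conn.execute(
    "SELECT country, sex, "
    "rank() OVER (PARTITION BY sex ORDER BY number DESC) AS n_rank "
    "FROM population ORDER BY sex, n_rank"
).fetchall()
for row in ranked:
    print(row)
```

Germany ranks first and Denmark third in both partitions, because the ranks are computed separately for the male and the female rows.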
When writing the ORDER BY clause, one can specify whether null values come first or last
(NULLS FIRST or NULLS LAST).
. By default, nulls are counted as high: for an ascending field they come last,
and for a descending field they come first.
. Example:
SELECT country, rank() OVER (ORDER BY number DESC NULLS LAST)
AS n_rank
FROM population
ORDER BY n_rank
Windowing
For example:
. Given population values for each country and year, calculate for each country and year
the average population over the current, previous, and next year.
. Query in SQL:
SELECT country, year,
avg(population) OVER (PARTITION BY country ORDER BY year ROWS BETWEEN 1 PRECEDING
AND 1 FOLLOWING) AS p_avg
FROM population
ORDER BY country, year, p_avg;
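
A small sketch of the three-row moving average, assuming Python's sqlite3 module with SQLite 3.25 or later and invented placeholder rows (countries X and Y). It computes the window once with PARTITION BY country and once with a single window ordered by country and year, to show how the frame behaves at the boundary between two countries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (country TEXT, year INTEGER, population INTEGER)")
# Invented toy rows: four years for country X, two for country Y.
conn.executemany(
    "INSERT INTO population VALUES (?, ?, ?)",
    [
        ("X", 2000, 10), ("X", 2001, 20), ("X", 2002, 30), ("X", 2003, 40),
        ("Y", 2000, 100), ("Y", 2001, 200),
    ],
)

FRAME = "ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING"

# Frame restricted to rows of the same country.
per_country = conn.execute(
    "SELECT country, year, avg(population) OVER "
    f"(PARTITION BY country ORDER BY year {FRAME}) AS p_avg "
    "FROM population ORDER BY country, year"
).fetchall()

# One window over all rows: the frame can reach into the neighbouring country.
unpartitioned = conn.execute(
    "SELECT country, year, avg(population) OVER "
    f"(ORDER BY country, year {FRAME}) AS p_avg "
    "FROM population ORDER BY country, year"
).fetchall()

for a, b in zip(per_country, unpartitioned):
    print(a, b[2])
```

At the first and last row of a partition the frame simply holds fewer rows, so the average is taken over two values there. Without PARTITION BY, the last year of X is averaged together with the first year of Y.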
For example:
. Find the average male and female population for each country and year on
the basis of the current, previous, and next year.
. Query in SQL:
SELECT country, sex, year,
avg(population) OVER (PARTITION BY country, sex ORDER BY year ROWS
BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS p_avg
FROM population
ORDER BY country, sex, year, p_avg;