02 Olap
What is OLAP?
• OLAP = Online Analytical Processing
• Supports (almost) ad-hoc querying for business analysts
• Think in terms of spreadsheets
– View sales data by geography, time, or product
• Extends the spreadsheet analysis model to work with warehouse data
– Large data sets
– Semantically enriched to understand business terms
– Combines interactive queries with reporting functions
What is OLAP?
• Reactive analysis tool
• Provides a multidimensional view of data
• Works with the concept of cubes
• Delivers information via the "slice, dice and rotate" method
• Various models available
– MOLAP
– ROLAP
– HOLAP
OLAP Models
• Relational OLAP (ROLAP):
– Extended relational DBMS that maps operations on multidimensional data to standard relational operations
– Stores all information, including fact tables, as relations
• Multidimensional OLAP (MOLAP):
– Special-purpose server that directly implements multidimensional data and operations
– Stores multidimensional datasets as arrays
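The difference between the two storage models can be sketched in a few lines of Python. This is only an illustration, not how any particular OLAP server works: the names `fact_rows`, `rolap_total`, and `molap_total` are hypothetical, the amounts are the product-by-client totals used in this deck, and storing empty cells as 0 is an assumption.

```python
# ROLAP-style view: every fact is a tuple in a relation
# (product, client, amount).
fact_rows = [
    ("p1", "c1", 56),
    ("p1", "c2", 4),
    ("p1", "c3", 50),
    ("p2", "c1", 11),
    ("p2", "c2", 8),
]


def rolap_total(rows, product):
    """Answer 'total for a product' with a relational-style scan and filter."""
    return sum(amount for prod, _client, amount in rows if prod == product)


# MOLAP-style view: dimension members map to array positions and the
# measure sits in a dense array (assumption: empty cells are stored as 0).
products = ["p1", "p2"]
clients = ["c1", "c2", "c3"]
cube = [[0] * len(clients) for _ in products]
for prod, client, amount in fact_rows:
    cube[products.index(prod)][clients.index(client)] = amount


def molap_total(array, product):
    """Answer the same question by summing one row of the array."""
    return sum(array[products.index(product)])


print(rolap_total(fact_rows, "p1"), molap_total(cube, "p1"))
```

Both functions return the same totals; the relational version filters tuples, while the array version addresses cells by position.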
OLAP Models
Cubes:
• Allow rapid analytical access
• Spare end users from writing language-based queries
The Sales Cube
(Products.Clothing, Location.Delhi, Time.98, Measures.Sales)
(Products.Clothing, Location.Mumbai, Time.97, Measures.Sales)
(Products.Groceries, Location.Pune, Time.95, Measures.Sales)
[Figure: sales cube with axes Product (Groceries, Appliances, Clothing), Time (95 to 99), and Location (Mumbai, Pune, Delhi, Kolkata, Chennai)]
OLAP vs. Data Warehouse
• OLAP and Data Warehouses are complementary.
• A Data Warehouse stores and manages data. OLAP transforms Data Warehouse data into strategic information.
[Figure: diagram with boxes labeled Sales, Product, Warehouse, Budget, and Store Sales]
Typical OLAP Operations
• Roll up (drill-up): summarize data
– by climbing up a hierarchy or by dimension reduction
• Drill down (roll down): reverse of roll-up
– from a higher-level summary to a lower-level summary or detailed data, or by introducing new dimensions
• Slice and dice:
– project and select
• Pivot (rotate):
– reorient the cube for visualization, e.g. turn a 3D cube into a series of 2D planes
• Other operations
– drill across: queries involving (across) more than one fact table
– drill through: go through the bottom level of the cube to its back-end relational tables (using SQL)
OLAP Queries
• Roll up: summarize data along a dimension hierarchy
– if we are given total sales volume per city, we can aggregate on the Location dimension to obtain sales per state
OLAP Queries
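Sales cube with dimensions client, product, and date of sale; client hierarchy: client, city, region. One date-of-sale plane:
client  city         Video  Camera  CD
c1      New Orleans  10     3       21
c2      New Orleans  12     5       9
c3      Poznań       11     7       7
c4      Poznań       12     11      15
Roll up from client to city:
city              Video  Camera  CD
New Orleans (NO)  22     8       30
Poznań (PN)       23     18      22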
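The roll-up from client to city can be reproduced with a small Python sketch. The numbers are the ones read from the figure; the function name `roll_up_to_city` and the row layout are illustrative assumptions.

```python
from collections import defaultdict

# One date-of-sale plane, as read from the figure:
# (client, city, video, camera, cd)
sales_by_client = [
    ("c1", "New Orleans", 10, 3, 21),
    ("c2", "New Orleans", 12, 5, 9),
    ("c3", "Poznań", 11, 7, 7),
    ("c4", "Poznań", 12, 11, 15),
]


def roll_up_to_city(rows):
    """Climb one level of the hierarchy: sum client rows per city."""
    totals = defaultdict(lambda: [0, 0, 0])
    for _client, city, *amounts in rows:
        for i, amount in enumerate(amounts):
            totals[city][i] += amount
    return {city: tuple(values) for city, values in totals.items()}


by_city = roll_up_to_city(sales_by_client)
print(by_city)
```

The result matches the rolled-up table: (22, 8, 30) for New Orleans and (23, 18, 22) for Poznań.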
OLAP Queries
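Sales cube with dimensions product, client, and day (- marks an empty cell):
day 1  c1  c2  c3
p1     12  -   50
p2     11  8   -
day 2  c1  c2  c3
p1     44  4   -
p2     -   -   -
Roll-up over day (drill-down is the reverse):
       c1  c2  c3
p1     56  4   50
p2     11  8   -
Further roll-up:
by client: c1 = 67, c2 = 12, c3 = 50
by product: p1 = 110, p2 = 19
grand total: 129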
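As a rough sketch, roll-up by dimension reduction amounts to summing one dimension out of the cube. The cell values below are the day 1 and day 2 amounts from this slide; the dictionary layout and the helper `roll_up` are assumptions made for illustration.

```python
from collections import defaultdict

# (product, client, day) -> amount
cube = {
    ("p1", "c1", 1): 12, ("p1", "c3", 1): 50,
    ("p2", "c1", 1): 11, ("p2", "c2", 1): 8,
    ("p1", "c1", 2): 44, ("p1", "c2", 2): 4,
}
DIMS = ("product", "client", "day")


def roll_up(cells, dims, drop):
    """Sum out the dimension named `drop` and return (cells, remaining dims)."""
    idx = dims.index(drop)
    out = defaultdict(int)
    for key, amount in cells.items():
        out[key[:idx] + key[idx + 1:]] += amount
    return dict(out), dims[:idx] + dims[idx + 1:]


by_product_client, dims_pc = roll_up(cube, DIMS, "day")
by_product, dims_p = roll_up(by_product_client, dims_pc, "client")
grand_total, _ = roll_up(by_product, dims_p, "product")

# Drill-down goes back to the finer level: the clients behind p1's total.
p1_detail = {k: v for k, v in by_product_client.items() if k[0] == "p1"}
print(by_product, grand_total, p1_detail)
```

Drill-down cannot be computed from the summary alone; it needs the more detailed cells, which is why the finer-grained dictionary is kept around here.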
OLAP Queries
• Slice and dice: select and project
– e.g. sales of video in the USA over the last 6 months
– Slicing fixes one dimension to a single value and so reduces the number of dimensions; dicing restricts several dimensions to subsets of values and yields a smaller sub-cube
• Pivot: reorient the cube
– The result of pivoting is called a cross-tabulation
– If we pivot the Sales cube on the Client and Product dimensions, we obtain a table with one cell for each combination of client and product value
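Slice and dice can be sketched as filters over the cube's cells. The sketch below reuses the six-row sales example from this deck; `slice_cube` and `dice_cube` are hypothetical helper names, and axes are addressed by position (0 = product, 1 = client, 2 = day).

```python
# (product, client, day) -> amount
cube = {
    ("p1", "c1", 1): 12, ("p1", "c3", 1): 50,
    ("p2", "c1", 1): 11, ("p2", "c2", 1): 8,
    ("p1", "c1", 2): 44, ("p1", "c2", 2): 4,
}


def slice_cube(cells, axis, value):
    """Fix one dimension to a single value and drop it from the key."""
    return {k[:axis] + k[axis + 1:]: v
            for k, v in cells.items() if k[axis] == value}


def dice_cube(cells, allowed):
    """Keep cells whose coordinate on each listed axis is in the allowed set."""
    return {k: v for k, v in cells.items()
            if all(k[axis] in values for axis, values in allowed.items())}


day1 = slice_cube(cube, 2, 1)                         # 2D plane for day 1
sub = dice_cube(cube, {0: {"p1"}, 1: {"c1", "c2"}})   # still 3D, but smaller
print(day1, sub)
```

The slice has two-part keys (one dimension fewer), while the dice keeps all three dimensions and only shrinks their ranges.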
OLAP Queries
• Pivoting can be combined with aggregation
Fact table sale:
prodId  clientId  date  amt
p1      c1        1     12
p2      c1        1     11
p1      c3        1     50
p2      c2        1     8
p1      c1        2     44
p1      c2        2     4
Pivot on date and client (amt summed over product; - marks an empty cell):
date  c1  c2  c3  Sum
1     23  8   50  81
2     44  4   -   48
Sum   67  12  50  129
Pivot on product and client (amt summed over date):
      c1  c2  c3  Sum
p1    56  4   50  110
p2    11  8   -   19
Sum   67  12  50  129
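The two cross-tabulations can be derived from the sale fact table with a short Python sketch. The helper `crosstab` is a hypothetical name; it sums `amt` into a row/column cell and into the "Sum" margins at the same time.

```python
from collections import defaultdict

# sale(prodId, clientId, date, amt)
sale = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]


def crosstab(rows, row_of, col_of):
    """Pivot the fact rows into row x column cells plus 'Sum' margins."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        amt = row[3]
        r, c = row_of(row), col_of(row)
        table[r][c] += amt
        table[r]["Sum"] += amt
        table["Sum"][c] += amt
        table["Sum"]["Sum"] += amt
    return {r: dict(cols) for r, cols in table.items()}


by_date = crosstab(sale, lambda r: r[2], lambda r: r[1])
by_product = crosstab(sale, lambda r: r[0], lambda r: r[1])
print(by_date)
print(by_product)
```

Empty cells are simply absent from the dictionaries, which mirrors the blank cells in the tables.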
OLAP - Slice
[Figure: the sales cube (Product: Groceries, Appliances, Clothing; Time: 95 to 99; Location: Mumbai, Pune, Delhi, Kolkata, Chennai) sliced for Year = 98, leaving a 2D Product by Location plane]
OLAP - Dice
[Figure: the sales cube diced for Product in (Groceries, Clothing), Year = 97 & 98, and measures Sales & Cost, leaving a smaller sub-cube across the locations]
OLAP Queries
• Ranking: selection of the first n elements (e.g. select the 5 best-selling products in July)
• Others: stored procedures, selection, etc.
• Time functions
– e.g., time average
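Ranking and a simple time function might look like the sketch below, which reuses the six-row sale table from this deck. The helper names are hypothetical, and reading "time average" as the mean of the daily totals is an assumption.

```python
import heapq
from collections import defaultdict

# sale(prodId, clientId, date, amt)
sale = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8), ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]


def top_n_products(rows, n):
    """Rank products by total amount and keep the first n."""
    totals = defaultdict(int)
    for prod, _client, _day, amt in rows:
        totals[prod] += amt
    return heapq.nlargest(n, totals.items(), key=lambda kv: kv[1])


def average_per_day(rows):
    """Time average, read here as the mean of the daily totals."""
    per_day = defaultdict(int)
    for _prod, _client, day, amt in rows:
        per_day[day] += amt
    return sum(per_day.values()) / len(per_day)


print(top_n_products(sale, 1), average_per_day(sale))
```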
Multi-Dimensional View of Data
• Finance: Profit by Division, by Country, by Month, by Actual/Budget
• Operations: Volume by Plant, by Shift, by Product, by Day
• Sales: Revenue by Product, by Region, by Sales Rep, by Quarter
• Marketing: Revenue by Customer, by Industry, by Channel, by Week
Multidimensional Data Model
      c1  c2  c3  Sum
p1    56  4   50  110
p2    11  8   -   19
Sum   67  12  50  129
(- marks an empty cell)
Aggregates
• Operators: sum, count, max, min, median, avg
• “Having” clause
• Using dimension hierarchy
– average by region (within store)
– maximum by month (within date)
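A group-by with a "having"-style filter can be sketched as follows, again on the deck's six-row sale table. The helper `group_having` is a hypothetical name, and the example groups by day rather than by month because the sample data only has a day column.

```python
from collections import defaultdict

# sale(prodId, clientId, date, amt)
sale = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8), ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]


def group_having(rows, key_index, agg, predicate):
    """Group amounts by one column, aggregate, then filter the groups."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[3])
    results = {key: agg(values) for key, values in groups.items()}
    return {key: value for key, value in results.items() if predicate(value)}


# Total per client, keeping only clients with a total above 20.
big_clients = group_having(sale, 1, sum, lambda total: total > 20)
# Maximum single sale per day, with no filter on the groups.
max_by_day = group_having(sale, 2, max, lambda _value: True)
print(big_clients, max_by_day)
```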
Cube Aggregation
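Product by client (aggregated over day; - marks an empty cell):
      c1  c2  c3
p1    56  4   50
p2    11  8   -
Aggregated over product: c1 = 67, c2 = 12, c3 = 50
Aggregated over client: p1 = 110, p2 = 19
Aggregated over both: 129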
Cube
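(* stands for "all values" of a dimension; - marks an empty cell)
all days (*)  c1  c2  c3  *
p1            56  4   50  110
p2            11  8   -   19
*             67  12  50  129
day 2         c1  c2  c3  *
p1            44  4   -   48
p2            -   -   -   -
*             44  4   -   48
day 1         c1  c2  c3  *
p1            12  -   50  62
p2            11  8   -   19
*             23  8   50  81
Example: sale(*, p2, *) = 19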
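One way to picture the cube with "*" entries is to add every base cell to each combination of kept and starred coordinates. The sketch below does this in plain Python; the key order (client, product, day) is an assumption chosen so that `sale(*, p2, *)` reads naturally, and `full_cube` is a hypothetical helper name.

```python
from collections import defaultdict
from itertools import product as cartesian

# (client, product, day) -> amount
sale = {
    ("c1", "p1", 1): 12, ("c1", "p2", 1): 11, ("c3", "p1", 1): 50,
    ("c2", "p2", 1): 8, ("c1", "p1", 2): 44, ("c2", "p1", 2): 4,
}


def full_cube(cells):
    """Add each cell to every aggregate that covers it ('*' = all values)."""
    out = defaultdict(int)
    for key, amount in cells.items():
        # Each coordinate is either kept or replaced by "*".
        for mask in cartesian((False, True), repeat=len(key)):
            agg_key = tuple("*" if starred else coord
                            for coord, starred in zip(key, mask))
            out[agg_key] += amount
    return dict(out)


cube = full_cube(sale)
print(cube[("*", "p2", "*")], cube[("c1", "*", "*")], cube[("*", "*", "*")])
```

With three dimensions, each base cell contributes to 2^3 = 8 entries, which is why precomputing the full cube trades storage for fast lookups.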
Aggregation Using Hierarchies
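Hierarchy (finest to coarsest): customer, region, country
day 1  c1  c2  c3
p1     12  -   50
p2     11  8   -
day 2  c1  c2  c3
p1     44  4   -
p2     -   -   -
Aggregated from customer to region (values match the day 1 plane):
      region A  region B
p1    12        50
p2    11        8
(customer c1 in Region A; customers c2, c3 in Region B)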
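Aggregating along a hierarchy only needs a mapping from each member to its parent. The sketch below applies the customer-to-region mapping stated on this slide to the day 1 plane; `aggregate_by_parent` and `region_of` are illustrative names.

```python
from collections import defaultdict

# Customer -> region, as stated on the slide.
region_of = {"c1": "A", "c2": "B", "c3": "B"}

# Day 1 plane: (product, customer) -> amount
day1 = {
    ("p1", "c1"): 12, ("p1", "c3"): 50,
    ("p2", "c1"): 11, ("p2", "c2"): 8,
}


def aggregate_by_parent(cells, parent_of):
    """Replace each customer by its parent in the hierarchy and sum."""
    out = defaultdict(int)
    for (prod, customer), amount in cells.items():
        out[(prod, parent_of[customer])] += amount
    return dict(out)


by_region = aggregate_by_parent(day1, region_of)
print(by_region)
```

The same function would climb from region to country given a region-to-country mapping, which the slide does not spell out.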
Aggregation Using Hierarchies
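Sales cube with dimensions client, product, and date of sale; client hierarchy: client, city, region. One date-of-sale plane:
client  city         Video  Camera  CD
c1      New Orleans  10     3       21
c2      New Orleans  12     5       9
c3      Poznań       11     7       7
c4      Poznań       12     11      15
Aggregation with respect to city:
city              Video  Camera  CD
New Orleans (NO)  22     8       30
Poznań (PN)       23     18      22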
A Sample Data Cube
[Figure: data cube with dimensions Date (1Q, 2Q, 3Q, 4Q, sum), Product (camera, video, CD, sum), and Country (USA, Canada, Mexico, sum)]
Exercise
• Suppose the AAA Automobile Co. builds a data
warehouse to analyze sales of its cars.
• The measure: price of a car
• Manufacturing department
– Production planning
– Defect analysis
Key features of OLAP
• Multidimensional views of data
– Ability to "slice and dice"
– View financial data by scenario (for example, actual vs. budget), organization, line items, and time
– View sales data by product, geography, channel, and time
• Calculation-intensive capabilities
– Share calculations (percentage of total)
– Moving averages and percentage growth
• Time intelligence
– This year vs. last year
– This month vs. the same month last year
• Aggregation
– Allows pre-aggregated values
• Drill down and drill up
• Pivot
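The calculation-intensive features listed above (share of total, moving averages, percentage growth) reduce to small formulas. The sketch below applies them to totals from this deck's sales example (product totals 110 and 19, daily totals 81 and 48); the function names are hypothetical, and comparing day 2 with day 1 stands in for a "this period vs. last period" comparison.

```python
def share_of_total(totals):
    """Share calculation: each member's fraction of the grand total."""
    grand = sum(totals.values())
    return {key: value / grand for key, value in totals.items()}


def moving_average(series, window):
    """Mean of each run of `window` consecutive values."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def percentage_growth(previous, current):
    """Growth of `current` over `previous`, in percent."""
    return (current - previous) / previous * 100


product_totals = {"p1": 110, "p2": 19}
day_totals = [81, 48]

shares = share_of_total(product_totals)
averages = moving_average(day_totals, 2)
growth = percentage_growth(day_totals[0], day_totals[1])
print(shares, averages, growth)
```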
Questions