Lecture 6

This document provides an overview of dimensional modeling for data warehousing. It discusses designing a data warehouse using a star schema with facts stored in fact tables and dimensions represented in dimension tables. The fact tables contain numeric measurements and foreign keys linking to dimension tables, while dimension tables contain descriptive attributes. Dimensional modeling focuses on critical business factors and improves query performance compared to entity-relationship modeling. Star schemas simplify queries and enable fast analysis through dimensional navigation and aggregation.

Uploaded by

haz
Copyright
© All Rights Reserved

Data Warehousing & Business Intel
DS-308
Course Instructor: Hamza Ali

Lecture 6
Dimensional Modelling

Outline:
 Design DWH
 Background (ER Modelling)
 Dimension Modelling
 Star Schemas
Designing DW

[Figure: three-tier data warehouse architecture. Information sources (operational DBs, semistructured sources) are extracted, transformed, loaded, and refreshed into the data warehouse server (Tier 1). The warehouse serves the OLAP servers (Tier 2; e.g., MOLAP, ROLAP), which serve the clients (Tier 3): analysis, query/reporting, data mining. Other labels in the figure: staging area, data marts.]

Background (ER Modeling)
  In ER modeling, entities are identified from the environment
  Each entity acts as a table
  Reasons for its success
• Tables are normalized after ER modeling, which removes redundancy (to avoid update/delete anomalies)
• But the number of tables increases
  Useful for fast access to small amounts of data
ER Drawbacks for DW / Need for Dimensional Modeling
  ER models are hard to remember because of the large number of tables
  Queries that span multiple tables are complex (table joins)
  Ideally, an ER model stores no calculated attributes
  A DW does not need to update data the way an OLTP system does, so there is no need for normalization
  A DW needs an efficient indexing scheme to avoid scanning all the data
Dimensional Modeling
  Dimensional modeling focuses on subject orientation and the critical factors of the business
  Critical factors are stored in facts
  Redundancy is not a problem; it is accepted to achieve efficiency
  A logical design technique for high performance
  It is the modeling technique for storage
Dimensional Modeling (cont.)
  Two important concepts
  Fact
• Numeric measurements that represent a business activity/event
• Pre-computed and redundant
• Example: profit, quantity sold
  Dimension
• Qualifying characteristics that give a perspective on a fact
• Example: date (date, month, quarter, year)
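The fact/dimension split above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `date_dimension_row`, the `date_key` format (yyyymmdd), and the sample measures are hypothetical and not taken from the lecture.

```python
from datetime import date

def date_dimension_row(d: date) -> dict:
    """Build one row of a hypothetical date dimension from a calendar date."""
    return {
        # Assumed key format yyyymmdd; a system-generated key would also work.
        "date_key": d.year * 10000 + d.month * 100 + d.day,
        "date": d.isoformat(),
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,
        "year": d.year,
    }

# Dimension row: descriptive attributes that qualify a fact.
row = date_dimension_row(date(2024, 5, 17))

# Fact row: numeric measures plus a key pointing at the dimension row.
# The measure values are made up for illustration.
fact = {"date_key": row["date_key"], "quantity_sold": 3, "profit": 12.50}
```

The fact carries only numbers and a key; everything a user would filter or group by (month, quarter, year) lives in the dimension row.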
Dimensional Modeling (cont.)
  Facts are stored in a fact table
  Dimensions are represented by dimension tables
  Each fact table is surrounded by dimension tables
  The layout looks like a star, hence the name Star Schema
Example
FACT: time_key (FK), store_key (FK), clerk_key (FK), product_key (FK), customer_key (FK), promotion_key (FK), dollars_sold, units_sold, dollars_cost
PRODUCT: product_key (PK), SKU, description, brand, category
TIME: time_key (PK), SQL_date, day_of_week, month
STORE: store_key (PK), store_ID, store_name, address, district, floor_type
CUSTOMER: customer_key (PK), customer_name, purchase_profile, credit_profile, address
CLERK: clerk_key (PK), clerk_id, clerk_name, clerk_grade
PROMOTION: promotion_key (PK), promotion_name, price_type, ad_type
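A cut-down version of the example schema can be tried with Python's built-in `sqlite3` module. This is a sketch under assumptions: only three of the six dimensions are included, the table names (`product_dim`, `time_dim`, `store_dim`, `sales_fact`) and all row values are invented for illustration, and SQLite stands in for whatever warehouse engine is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one surrogate primary key plus descriptive attributes.
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY,
                          sku TEXT, description TEXT, brand TEXT, category TEXT);
CREATE TABLE time_dim    (time_key INTEGER PRIMARY KEY,
                          sql_date TEXT, day_of_week TEXT, month TEXT);
CREATE TABLE store_dim   (store_key INTEGER PRIMARY KEY,
                          store_id TEXT, store_name TEXT, district TEXT);

-- Fact table: foreign keys to the dimensions plus numeric measures.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    store_key    INTEGER REFERENCES store_dim(store_key),
    product_key  INTEGER REFERENCES product_dim(product_key),
    dollars_sold REAL, units_sold INTEGER, dollars_cost REAL
);

-- Made-up sample data.
INSERT INTO product_dim VALUES
    (1, 'SKU-1', 'Cola 330ml', 'BrandA', 'Drinks'),
    (2, 'SKU-2', 'Chips 50g',  'BrandB', 'Snacks');
INSERT INTO time_dim VALUES
    (1, '2024-01-15', 'Monday',   'January'),
    (2, '2024-02-03', 'Saturday', 'February');
INSERT INTO store_dim VALUES (1, 'S-01', 'Main Street', 'North');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 10.0,  5,  6.0),
    (1, 1, 2,  4.0,  2,  2.5),
    (2, 1, 1, 20.0, 10, 12.0);
""")

# Star join: the fact table joins directly to each dimension it needs.
rows = conn.execute("""
    SELECT p.brand, t.month,
           SUM(f.dollars_sold) AS dollars_sold,
           SUM(f.units_sold)   AS units_sold
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    JOIN time_dim    t ON t.time_key    = f.time_key
    GROUP BY p.brand, t.month
    ORDER BY p.brand, t.month
""").fetchall()
```

Every dimension is one join away from the fact table, so a query only joins the dimensions it filters or groups by.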
Inside Dimensional Modeling
  Inside Dimension table
• Key attribute that identifies each row of the dimension table
• Large number of columns (a wide table)
• Non-calculated, textual attributes
• Attributes are not necessarily directly related to each other (e.g., brand and package size)
Inside Dimensional Modeling
  Inside Dimension table (cont.)
• Un-normalized in a star schema
• Drill-down and roll-up are two ways of exploiting dimensions
• Can have multiple hierarchies (e.g., product category for marketing and product category for accounting)
• Relatively small number of records
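Roll-up and drill-down amount to aggregating the same facts at different levels of a dimension hierarchy. A minimal sketch, assuming a year > quarter > month hierarchy and made-up sales figures; the helper name `aggregate` is hypothetical:

```python
from collections import defaultdict

# Rows as (year, quarter, month, units_sold); the values are invented.
sales = [
    (2024, "Q1", "Jan", 5),
    (2024, "Q1", "Feb", 7),
    (2024, "Q2", "Apr", 4),
]

def aggregate(rows, depth):
    """Sum units_sold over the first `depth` levels of the hierarchy."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[:depth]] += row[-1]
    return dict(totals)

by_month = aggregate(sales, 3)    # most detailed view
by_quarter = aggregate(sales, 2)  # roll-up: month -> quarter
by_year = aggregate(sales, 1)     # roll-up: quarter -> year
```

Reading the three results from `by_year` down to `by_month` is a drill-down; reading them in the other direction is a roll-up.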
Inside Dimensional Modeling
  Fact tables have two types of attributes
• Key attributes, for connecting to the dimension tables
• Facts
  Inside fact table
• Concatenated key
• Grain (level of detail) of the data is identified
• Large number of records
• Limited number of attributes
• Sparse data set
• Degenerate dimensions (e.g., order number, which supports measures such as average products per order)
  Fact-less fact table (holds only keys, no numeric facts)
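The degenerate-dimension idea can be illustrated with `sqlite3`: the order number sits in the fact table with no dimension table behind it, yet it still lets us group lines into orders. The table name `order_line_fact` and the rows are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_line_fact (
    order_number TEXT,     -- degenerate dimension: no dimension table behind it
    product_key  INTEGER,  -- would reference a product dimension
    units_sold   INTEGER   -- numeric fact
);
-- Made-up data: ORD-1 has three products, ORD-2 has one.
INSERT INTO order_line_fact VALUES
    ('ORD-1', 1, 2), ('ORD-1', 2, 1), ('ORD-1', 3, 4),
    ('ORD-2', 1, 1);
""")

# Average number of distinct products per order, grouped by the
# degenerate dimension.
avg_products_per_order = conn.execute("""
    SELECT AVG(n) FROM (
        SELECT COUNT(DISTINCT product_key) AS n
        FROM order_line_fact
        GROUP BY order_number
    )
""").fetchone()[0]
```

Here the grain is one row per order line, which is what makes a per-order average computable from the fact table alone.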
Star Schema Keys
  Surrogate keys in dimension tables
• Replace the primary key of the source system
• System generated
  Foreign keys in fact tables
• The primary keys of the dimension tables
  Primary key in the fact table
• Combination of the primary keys that come from the dimension tables
• May be a degenerate dimension
• May be a system-generated surrogate key
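One way to picture system-generated surrogate keys is a small lookup that hands out the next integer the first time a source (natural) key is seen. This is a sketch only; the class name `SurrogateKeyGenerator` and the store IDs are hypothetical, and real warehouses usually rely on database sequences or identity columns instead.

```python
import itertools

class SurrogateKeyGenerator:
    """Hands out system-generated integer keys for natural (source) keys."""

    def __init__(self):
        self._next = itertools.count(1)   # 1, 2, 3, ...
        self._by_natural_key = {}

    def key_for(self, natural_key):
        # Reuse the existing surrogate key if this natural key was seen before.
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = next(self._next)
        return self._by_natural_key[natural_key]

store_keys = SurrogateKeyGenerator()
k1 = store_keys.key_for("S-01")        # first store seen
k2 = store_keys.key_for("S-02")        # second store seen
k1_again = store_keys.key_for("S-01")  # same store, same surrogate key
```

The fact table then stores `k1` or `k2` as its foreign key, so it does not depend on the format of the source system's identifiers.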
Advantages of Star Schema
  Easy for users to understand
  Optimized for navigation (fewer joins, so faster)
  Most suitable for query processing (drill-down, roll-up)
  Special join and indexing techniques allow further query optimization
Questions?
