Week5
Data Warehouse
Models
Sonny Boy M. Sasis
IT Faculty
Cagayan State University
Learning Objectives
At the end of this unit, the student is expected to:
1 Distinguish Dimensional Modeling
5 Compare and contrast the advantages and disadvantages of each schema type
1 Fundamentals of Dimensional Modeling
used to generate data for these
1 100 ABC Apprentice Surgeon 45000 75000
E.g. The time dimension contains day, week, month, quarter, year, decade, etc.
Surrogate Key/Synthetic Key is a key that is used as the primary key in a dimension table in the data warehouse.
Types of Facts in DWH
E.g. Suppose a pharmaceutical company "SUN" manufactures ten types of drugs. The profit margin for each of the drugs is a non-additive fact.
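To see why a profit margin is non-additive, the short sketch below uses invented figures (the drug labels, revenue, and profit values are assumptions, not data from the lesson). Revenue and profit can be summed across drugs, but summing the per-drug margins gives a number with no business meaning; the overall margin has to be recomputed from the additive totals.

```python
# Hypothetical sales facts for three drugs; all figures are invented for illustration.
sales = [
    {"drug": "A", "revenue": 1000.0, "profit": 200.0},
    {"drug": "B", "revenue": 500.0, "profit": 250.0},
    {"drug": "C", "revenue": 2000.0, "profit": 100.0},
]


def margin(profit, revenue):
    """Profit margin expressed as a fraction of revenue."""
    return profit / revenue


# Additive facts: summing across rows is meaningful.
total_revenue = sum(row["revenue"] for row in sales)
total_profit = sum(row["profit"] for row in sales)

# Non-additive fact: adding the per-drug margins together is NOT meaningful.
summed_margins = sum(margin(row["profit"], row["revenue"]) for row in sales)

# The overall margin must be recomputed from the additive totals instead.
overall_margin = margin(total_profit, total_revenue)

print(f"sum of per-drug margins: {summed_margins:.2f}")
print(f"overall margin:          {overall_margin:.4f}")
```

The two printed values differ, which is the practical meaning of "non-additive": the fact has to be stored or derived per row and recalculated at each level of aggregation.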
Example of Dimensions: The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis.
Dimensions: Product, Location and Time
Attributes: For Product: Product key (the primary key of the dimension table, referenced as a foreign key from the fact table), Name, Type, Specifications
Hierarchies: For Location: Country, State, City, Street Address, Name
5) Build Schema. In this step, you implement the Dimension Model.
A schema is the database structure (arrangement of tables).
● It is a logical description of the entire database.
● It includes the name and description of records of all record types, including all associated data items and aggregates.
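The following is a minimal sketch of the Build Schema step for the CEO example, using Python's built-in sqlite3 module. The table names (dim_product, dim_location, dim_time, fact_sales), the column names, and the sample rows are all assumptions made for illustration; the lesson itself only names the dimensions Product, Location and Time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table in the center, three dimension tables around it.
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,   -- surrogate key (hypothetical column name)
    name TEXT, type TEXT, specifications TEXT
);
CREATE TABLE dim_location (
    location_key INTEGER PRIMARY KEY,
    country TEXT, state TEXT, city TEXT, street_address TEXT
);
CREATE TABLE dim_time (
    time_key INTEGER PRIMARY KEY,
    day TEXT, month INTEGER, quarter INTEGER, year INTEGER
);
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    time_key     INTEGER REFERENCES dim_time(time_key),
    sales_amount REAL                  -- the measure
);
""")

# Invented sample rows.
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?, ?)",
                 [(1, "Widget", "Hardware", "small"),
                  (2, "Gadget", "Hardware", "large")])
conn.executemany("INSERT INTO dim_location VALUES (?, ?, ?, ?, ?)",
                 [(1, "Country Z", "State X", "City A", "1 First St"),
                  (2, "Country Z", "State Y", "City B", "2 Second St")])
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?, ?, ?)",
                 [(1, "2024-01-01", 1, 1, 2024),
                  (2, "2024-01-02", 1, 1, 2024)])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 100.0), (1, 1, 1, 50.0),
                  (2, 2, 1, 75.0), (1, 2, 2, 20.0)])

# Sales for specific products in different locations on a daily basis.
daily_sales = conn.execute("""
    SELECT p.name, l.city, t.day, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_product  p ON f.product_key  = p.product_key
    JOIN dim_location l ON f.location_key = l.location_key
    JOIN dim_time     t ON f.time_key     = t.time_key
    GROUP BY p.name, l.city, t.day
    ORDER BY p.name, l.city, t.day
""").fetchall()

for row in daily_sales:
    print(row)
```

Each dimension key appears once as a primary key in its dimension table and again as a foreign key column in the fact table, so the CEO's question is answered with one join per dimension.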
Types of DWH Schema
● Star Schema is the basic form of a dimensional model, in which data are organized into facts and dimensions. It is called a star schema because the diagram resembles a star, with points radiating from a center. The center of the star consists of the fact table, and the points of the star are the dimension tables.
Disadvantage: Data redundancy. Values may be repeated in some instances; for example, city, province_or_state and country would be repeated for two streets in the same city.
Advantages:
1. Simplest and easiest to understand
2. It optimizes navigation through the database
3. Most suitable for query processing
4. Faster performance, as fewer joins are required
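The redundancy described for the star schema can be seen in a small sketch. It assumes a hypothetical denormalized dim_location table with invented rows; only the column names city, province_or_state and country come from the lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A denormalized location dimension, as used in a star schema.
conn.execute("""
    CREATE TABLE dim_location (
        location_key INTEGER PRIMARY KEY,
        street TEXT, city TEXT, province_or_state TEXT, country TEXT
    )
""")
conn.executemany(
    "INSERT INTO dim_location VALUES (?, ?, ?, ?, ?)",
    [
        (1, "1 First St", "City A", "Province X", "Country Z"),
        (2, "2 Second St", "City A", "Province X", "Country Z"),
        (3, "3 Third St", "City B", "Province Y", "Country Z"),
    ],
)

# Count how many rows store the same city/province/country combination.
repeats = conn.execute("""
    SELECT city, province_or_state, country, COUNT(*)
    FROM dim_location
    GROUP BY city, province_or_state, country
    ORDER BY city
""").fetchall()

for row in repeats:
    print(row)
```

The two streets in City A each carry their own copy of the city, province and country values, which is the repetition the disadvantage refers to.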
SNOWFLAKE SCHEMA
● Snowflake Schema is an extension of
the star schema. In a snowflake
schema, each dimension is normalized
and connected to further dimension
tables. It is named snowflake because
the decomposition of one denormalized
dimension into many normalized
dimensions makes the diagram look
like a snowflake.
Disadvantages:
1. Slow Performance - too many joins
required to form the result.
2. It is a complex schema
Advantages:
1. Fewer redundancies due to
normalization of dimension tables.
2. Dimension Tables are easier to update.
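A minimal sketch of the trade-off, assuming a hypothetical location dimension split into dim_location and dim_city (the table names, column names, and rows are invented for illustration). City details are stored once, so an update touches a single row, but a query by city now needs one more join than it would in a star schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The location dimension is normalized: city details live in their own table.
conn.executescript("""
CREATE TABLE dim_city (
    city_key INTEGER PRIMARY KEY,
    city TEXT, province_or_state TEXT, country TEXT
);
CREATE TABLE dim_location (
    location_key INTEGER PRIMARY KEY,
    street TEXT,
    city_key INTEGER REFERENCES dim_city(city_key)
);
CREATE TABLE fact_sales (
    location_key INTEGER REFERENCES dim_location(location_key),
    sales_amount REAL
);
INSERT INTO dim_city VALUES (1, 'City A', 'Province X', 'Country Z');
INSERT INTO dim_location VALUES (1, '1 First St', 1), (2, '2 Second St', 1);
INSERT INTO fact_sales VALUES (1, 100.0), (2, 50.0);
""")

# Advantage: the city name is stored once, so a rename updates one row.
cursor = conn.execute(
    "UPDATE dim_city SET city = 'City A (renamed)' WHERE city_key = 1"
)
rows_updated = cursor.rowcount

# Disadvantage: reaching the city from the fact table takes two joins.
sales_by_city = conn.execute("""
    SELECT c.city, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_location l ON f.location_key = l.location_key
    JOIN dim_city     c ON l.city_key     = c.city_key
    GROUP BY c.city
""").fetchall()

print(rows_updated, sales_by_city)
```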
● Galaxy Schema/Fact Constellation
Schema is a collection of multiple star
schemas in which multiple fact tables are
connected to their respective
dimensions, often sharing some of them.
The collection of star schemas resembles
a galaxy, which is why it is called a
galaxy schema.
Disadvantages:
1. Complex due to multiple fact tables.
2. It is difficult to manage.
3. Dimension Tables are very large.
Advantages:
1. Ensures data reusability.
2. Guarantees referential integrity.
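As a rough sketch of a fact constellation, the example below assumes two hypothetical fact tables, fact_sales and fact_shipments, that share one dim_product dimension. All names and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two fact tables (two "stars") reuse the same product dimension.
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL
);
CREATE TABLE fact_shipments (
    product_key INTEGER REFERENCES dim_product(product_key),
    units_shipped INTEGER
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 75.0);
INSERT INTO fact_shipments VALUES (1, 10), (2, 4), (2, 5);
""")

# Aggregate each fact table on its own, then combine the results through
# the shared dimension. Joining the raw fact tables directly would multiply rows.
report = conn.execute("""
    SELECT p.name, s.total_sales, sh.total_units
    FROM dim_product p
    JOIN (SELECT product_key, SUM(sales_amount) AS total_sales
          FROM fact_sales GROUP BY product_key) s
      ON s.product_key = p.product_key
    JOIN (SELECT product_key, SUM(units_shipped) AS total_units
          FROM fact_shipments GROUP BY product_key) sh
      ON sh.product_key = p.product_key
    ORDER BY p.name
""").fetchall()

for row in report:
    print(row)
```

The shared dim_product table is what makes the dimension reusable across both business processes; it is also why the schema is harder to manage, since a change to that table affects every fact table that references it.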
Benefits of Dimensional
Modeling
● Standardization of dimensions allows easy reporting across areas of the business.
● Dimension tables store the history of the dimensional information.
● It allows entirely new dimensions to be introduced without major disruptions to the fact table.
● Compared to the normalized model, dimensional tables are easier to understand.
● Information is grouped into clear and simple business categories.
● The dimensional model is very understandable by the business. This model is based on business terms, so that the business knows what each fact, dimension, or attribute means.
● The dimensional model also helps boost query performance. It is more denormalized; therefore, it is optimized for querying.
● Dimensional models are denormalized and optimized for fast data querying. Many relational database platforms recognize this model and optimize query execution plans to aid in performance.
● Dimensional modelling in a data warehouse creates a schema which is optimized for high performance. It means fewer joins and helps minimize data redundancy.
● Dimensional models can comfortably accommodate change. Dimension tables can have more columns added to them without affecting existing business intelligence applications using these tables.
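The point about accommodating change can be sketched as follows. The dim_product table, its brand column, and the "existing" report query are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name TEXT
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
""")

# A query that an existing report is assumed to run.
existing_query = "SELECT product_key, name FROM dim_product ORDER BY product_key"
before = conn.execute(existing_query).fetchall()

# Add a new attribute to the dimension table.
conn.execute("ALTER TABLE dim_product ADD COLUMN brand TEXT")

# The existing query returns the same rows as before the change.
after = conn.execute(existing_query).fetchall()

print(before == after)
```

This holds here because the existing query names its columns explicitly; a report written with SELECT * would see the extra column.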
Summary
● A dimensional model is a data structure technique optimized for data warehousing tools.
● Facts are the measurements/metrics or facts from your business process.
● A dimension provides the context surrounding a business process event.
● Attributes are the various characteristics of a dimension.
● Measures are numeric data based on columns in a fact table.
● A fact table is a primary table in a dimensional model.
● A dimension table contains dimensions of a fact.
● Types of Dimensions are Conformed, Junk/Dirty, Degenerate, Static, and Slowly Changing Dimensions.
● There are 4 types of facts: (1) Additive (2) Non-additive (3) Semi-additive (4) Factless Facts
● Five steps of Dimensional modeling are (1) Identify Business Process (2) Identify Grain (level of detail) (3) Identify Dimensions (4) Identify Facts (5) Build Schema
● A schema is the database structure (arrangement of tables).
● The three (3) types of schema are: (1) Star (2) Snowflake (3) Fact Constellation/Galaxy Schema
Your Turn!
Let’s do the “Two Truths and a Lie” learning activity.
● Individual Reflection: Write down three statements about the lesson, ensuring one is a false statement.
Thank you for participating!
https://tinyurl.com/Dimensional-Modeling
Kimball, R. The Data Warehouse Toolkit. 3rd Edition