DATAWAREHOUSE PPT NEWW
Data warehouse:
• They store current and historical data in one single place and are used
for creating analytical reports for knowledge workers throughout the
enterprise.
Design of Data Warehouse:
There are three strategies for implementing a data warehouse:
Bottom Up Design:
In the bottom-up approach, data marts are first created to provide reporting and
analytical capabilities for specific business processes.
Top Down Design:
In the top-down approach, the data warehouse is designed using "atomic" data; that is,
data at the greatest level of detail is stored in the data warehouse.
Hybrid Design:
A hybrid DW database is kept in third normal form to eliminate data
redundancy and makes use of features of both of the designs above.
Implementation of Database
The implementation follows the ETL process, i.e. Extraction, Transformation and Loading.
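The three ETL steps can be sketched as follows. This is only a minimal illustration, not a production pipeline: the CSV sample, the `sales` table and the function names `extract`, `transform` and `load` are all assumptions made for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would come from an operational system.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,20.00
3,carol,not_a_number
"""


def extract(text):
    """Extract: read raw rows from the source (here, an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, convert types, drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not numeric
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM sales ORDER BY order_id"
).fetchall()
print(loaded)
```

Only the two valid rows reach the `sales` table; the row with a non-numeric amount is rejected in the transform step.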
Star and Snowflake Schema
Star Schema
• The star schema architecture is the simplest data warehouse schema.
• The center of the star consists of the fact table, and the points of the star
are the dimension tables.
Star Schema Example
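A small star schema might look like the sketch below. The table and column names (`fact_sales`, `dim_product`, `dim_store`) and the sample rows are assumptions chosen for illustration, using an in-memory SQLite database. The fact table sits in the center and each dimension is reached with a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: the points of the star (hypothetical names and data).
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);

-- Fact table: the center of the star, holding foreign keys and measures.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    units_sold INTEGER,
    revenue    REAL
);

INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Notebook', 'Stationery'), (3, 'Lamp', 'Home');
INSERT INTO dim_store   VALUES (1, 'Pune'), (2, 'Delhi');
INSERT INTO fact_sales  VALUES (1, 1, 10, 50.0), (2, 1, 4, 40.0), (3, 2, 1, 30.0), (1, 2, 6, 30.0);
""")

# One join from the fact table to a dimension is enough to slice a measure.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(revenue_by_category)
```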
The main characteristics of star schema:
• Simple structure
• High query effectiveness
• Relatively long load time for dimension tables
Snowflake Schema
• The snowflake schema architecture is a more complex variation of the
star schema used in a data warehouse, because the tables which
describe the dimensions are normalized.
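As a rough sketch of what normalizing a dimension means, the example below splits a product dimension into `dim_product` and `dim_category`. All table names, column names and rows are hypothetical, and SQLite is used only for convenience. Compared with a star schema, reaching the category attribute now takes one extra join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The category attribute is moved out of the product dimension (normalized).
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    revenue    REAL
);

INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Home');
INSERT INTO dim_product  VALUES (1, 'Pen', 1), (2, 'Notebook', 1), (3, 'Lamp', 2);
INSERT INTO fact_sales   VALUES (1, 50.0), (2, 40.0), (3, 30.0);
""")

# Two joins are needed: fact -> product -> category.
snowflake_rows = conn.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_id  = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(snowflake_rows)
```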
Slowly Changing Dimensions
• Slowly changing dimensions are dimension attributes that change slowly over time rather than on a regular basis.
• Data captured by Slowly Changing Dimensions (SCDs) change slowly but unpredictably,
rather than according to a regular schedule.
E.g. the transfer of a person, which changes their regional office's ID.
• These scenarios can sometimes cause referential integrity problems and can be handled
with several methodologies:
• Type 0: retain original
• Type 1: overwrite
• Type 2: add new row
• Type 3: add new attribute
• Type 4: add history table
• Type 6: (1+2+3)
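The difference between Types 1, 2 and 3 can be sketched with plain Python dictionaries standing in for dimension rows. The column names (`surrogate_key`, `office_id`, `valid_from`, `valid_to`, `is_current`, `previous_office_id`) and the sample employee are assumptions made for this illustration, following the office-transfer example above.

```python
from datetime import date


def scd_type1(row, new_office):
    """Type 1: overwrite the attribute; no history is kept."""
    updated = dict(row)
    updated["office_id"] = new_office
    return updated


def scd_type2(rows, employee_id, new_office, change_date):
    """Type 2: close the current row and add a new row with a new surrogate key."""
    result = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not modified
        if row["employee_id"] == employee_id and row["is_current"]:
            row["valid_to"] = change_date
            row["is_current"] = False
        result.append(row)
    next_key = max(r["surrogate_key"] for r in rows) + 1
    result.append({
        "surrogate_key": next_key,
        "employee_id": employee_id,
        "office_id": new_office,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })
    return result


def scd_type3(row, new_office):
    """Type 3: add a new attribute that keeps the previous value."""
    updated = dict(row)
    updated["previous_office_id"] = row["office_id"]
    updated["office_id"] = new_office
    return updated


# Hypothetical employee who is transferred from the NORTH office to SOUTH.
employee = {
    "surrogate_key": 1,
    "employee_id": 42,
    "office_id": "NORTH",
    "valid_from": date(2020, 1, 1),
    "valid_to": None,
    "is_current": True,
}

type1 = scd_type1(employee, "SOUTH")
type2 = scd_type2([employee], 42, "SOUTH", date(2024, 6, 1))
type3 = scd_type3(employee, "SOUTH")
```

Type 1 loses the old office, Type 2 keeps full history as separate rows, and Type 3 keeps only the immediately preceding value.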
Examples
• Type 1:
• Type 2:
• Type 3:
• Type 4:
• Type 6:
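Types 4 and 6 can be sketched in the same spirit. This is one common reading of the two techniques, not a definitive implementation; the column names (`historical_office_id`, `current_office_id`, `replaced_on` and so on) and the sample data are hypothetical.

```python
from datetime import date


def scd_type4(current, history, employee_id, new_office, change_date):
    """Type 4: keep only the latest value in the main table and move the
    replaced value into a separate history table."""
    new_history = history + [{
        "employee_id": employee_id,
        "office_id": current[employee_id],
        "replaced_on": change_date,
    }]
    new_current = {**current, employee_id: new_office}
    return new_current, new_history


def scd_type6(rows, employee_id, new_office, change_date):
    """Type 6 (1+2+3): add a new row (Type 2) that carries a current-value
    column (Type 3), which is overwritten on every row of the member (Type 1)."""
    result = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not modified
        if row["employee_id"] == employee_id:
            row["current_office_id"] = new_office  # Type 1 overwrite
            if row["is_current"]:
                row["valid_to"] = change_date  # Type 2: close the old version
                row["is_current"] = False
        result.append(row)
    result.append({
        "employee_id": employee_id,
        "historical_office_id": new_office,
        "current_office_id": new_office,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })
    return result


# Hypothetical data: employee 42 moves from NORTH to SOUTH.
current_table, history_table = scd_type4({42: "NORTH"}, [], 42, "SOUTH", date(2024, 6, 1))

type6_rows = scd_type6(
    [{
        "employee_id": 42,
        "historical_office_id": "NORTH",
        "current_office_id": "NORTH",
        "valid_from": date(2020, 1, 1),
        "valid_to": None,
        "is_current": True,
    }],
    42, "SOUTH", date(2024, 6, 1),
)
```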
Dimension Relationships
• A relationship between a dimension and a measure group consists of
the dimension and fact tables participating in the relationship and a
granularity attribute that specifies the granularity of the dimension in
the particular measure group.
• Types of dimensions:
1. Conformed dimensions
2. Junk dimensions
3. Role Playing dimensions
4. Degenerate dimensions
Conformed and Junk Dimensions
• Conformed dimension
– Shared by multiple fact tables.
– Used when all business users have the same definitions for the dimension.
Figure: Conformed Dimension
• Junk dimension
– Dimension table targeted to a single fact table.
– Used when dimensions have different definitions for different business units.
Figure: Junk Dimension
Role Playing and Degenerate Dimensions
• Role Playing dimension
– Has multiple valid relationships with a fact table.
– Plays different roles in a fact table depending on the context.
Figure: Role Playing Dimension
• Degenerate dimension
– Used by a single fact table.
– Dimension value is stored directly in the fact table.
– No corresponding dimension table.
Figure: Degenerate Dimension
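Both ideas can be shown in one small sketch. Here a single `dim_date` table plays two roles (order date and ship date) for the same fact table, while `order_number` is kept directly in the fact table as a degenerate dimension. The table names, column names and sample row are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, calendar_date TEXT);

CREATE TABLE fact_orders (
    order_number  TEXT,     -- degenerate dimension: no dimension table behind it
    order_date_id INTEGER REFERENCES dim_date(date_id),  -- role 1 of dim_date
    ship_date_id  INTEGER REFERENCES dim_date(date_id),  -- role 2 of dim_date
    amount        REAL
);

INSERT INTO dim_date    VALUES (1, '2024-01-01'), (2, '2024-01-03');
INSERT INTO fact_orders VALUES ('SO-1001', 1, 2, 99.0);
""")

# The same dimension table is joined twice under different aliases,
# once for each role it plays.
role_rows = conn.execute("""
    SELECT f.order_number, od.calendar_date, sd.calendar_date
    FROM fact_orders AS f
    JOIN dim_date AS od ON od.date_id = f.order_date_id
    JOIN dim_date AS sd ON sd.date_id = f.ship_date_id
""").fetchall()
print(role_rows)
```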
Facts
• Facts are the key metrics used to measure business results:
– Sales
– Production
– Inventory
• Can be additive, e.g. sales
• Semi-additive, e.g. inventory
• Non-additive, e.g. profit percent
What are Fact Tables?
• In data warehousing, a Fact table consists of the measurements, metrics
or facts of a business process.
• It is located at the center of a star schema or a snowflake
schema surrounded by dimension tables.
• A fact table typically has two types of columns: those that contain facts
and those that are foreign keys to dimension tables. The primary key of
a fact table is usually a composite key that is made up of all of its foreign
keys.
• Fact tables contain the content of the data warehouse and store different
types of measures, such as additive, non-additive, and semi-additive
measures.
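The composite primary key can be illustrated with a short sketch. The `fact_sales`, `dim_date` and `dim_product` tables below are hypothetical, and SQLite is used only as a convenient stand-in. Because the key is made up of all the foreign keys, a second row for the same date and product is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY);

-- Foreign key columns plus one fact column; the primary key is the
-- combination of all foreign keys.
CREATE TABLE fact_sales (
    date_id    INTEGER NOT NULL REFERENCES dim_date(date_id),
    product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
    units_sold INTEGER,
    PRIMARY KEY (date_id, product_id)
);

INSERT INTO dim_date    VALUES (20240101);
INSERT INTO dim_product VALUES (1);
INSERT INTO fact_sales  VALUES (20240101, 1, 5);
""")

# A second row with the same (date_id, product_id) violates the composite key.
try:
    conn.execute("INSERT INTO fact_sales VALUES (20240101, 1, 7)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(duplicate_rejected, row_count)
```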
Figure 1: Fact Table
Figure 2: Fact Table Example
Granularity
Various Types of Measures
1. Additive measures
• Measures that can be added across all dimensions.
2. Semi-additive measures
• Measures that can be added across some, but not all, dimensions.
3. Non-additive measures
• Measures that cannot be added across any dimension.
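A small worked example may help to tell the three kinds of measures apart. The rows below are made-up daily figures for two stores, and the column layout is an assumption for this sketch.

```python
# Hypothetical rows: (day, store, units_sold, stock_on_hand, revenue, profit)
rows = [
    ("Mon", "A", 10, 100, 200.0, 50.0),
    ("Mon", "B", 5, 40, 100.0, 10.0),
    ("Tue", "A", 8, 92, 160.0, 40.0),
    ("Tue", "B", 7, 33, 140.0, 28.0),
]

# Additive: units sold can be summed across every dimension (day and store).
total_units = sum(r[2] for r in rows)

# Semi-additive: stock on hand can be summed across stores for a single day,
# but summing it across days would count the same stock more than once.
stock_tuesday = sum(r[3] for r in rows if r[0] == "Tue")

# Non-additive: profit percent cannot be summed. It has to be recomputed
# from its additive components (profit and revenue).
total_revenue = sum(r[4] for r in rows)
total_profit = sum(r[5] for r in rows)
profit_percent = 100 * total_profit / total_revenue

# Adding the per-row percentages gives a number with no business meaning.
sum_of_row_percents = sum(100 * r[5] / r[4] for r in rows)

print(total_units, stock_tuesday, round(profit_percent, 2), sum_of_row_percents)
```

The overall profit percent is about 21.33, whereas the sum of the four per-row percentages is 80, which illustrates why ratios are treated as non-additive.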
Discussion