DATAWAREHOUSE PPT NEWW

The document discusses designing and implementing a data warehouse, including describing the key components of a data warehouse like fact and dimension tables. It also covers different data warehouse designs like star schemas and snowflake schemas, and how to handle slowly changing dimensions. The goal is to extract, transform, and load data from source systems into dimensional models to enable business intelligence reporting and analysis.

Uploaded by

Mony Toppo
Copyright © All Rights Reserved

Design and Implement a Data Warehouse

TEAM MENTORS – SHAHID A, PRACHI SARANG
TEAM MEMBERS – ASWINI, MONY TOPPO
JAN 22, 2019

Data Warehouse
• A Data warehouse (DW or DWH), also known as an enterprise data
warehouse (EDW), is a system used for reporting and data analysis, and
is considered a core component of business intelligence.

• DWs are central repositories of integrated data from one or more
sources.

• They store current and historical data in one single place and are used
for creating analytical reports for knowledge workers throughout the
enterprise.
Design of Data Warehouse:
There are three strategies for implementing a data warehouse:
Bottom-Up Design:
In the bottom-up approach, data marts are created first to provide reporting and
analytical capabilities for specific business processes.
Top-Down Design:
In the top-down approach, the data warehouse is designed around "atomic" data, that is,
data stored at the greatest level of detail.
Hybrid Design:
A hybrid data warehouse database is kept in third normal form to eliminate data
redundancy and makes use of features of both of the designs above.
Implementation of Database
Implementation follows the ETL process, i.e. extraction, transformation, and loading.

Designing and implementing the database involves these 7 steps:

Step 1: Determine Business Objectives
Step 2: Collect and Analyze Information
Step 3: Identify Core Business Processes
Step 4: Construct a Conceptual Data Model
Step 5: Locate Data Sources and Plan Data Transformations
Step 6: Set Tracking Duration
Step 7: Implement the Plan
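
The ETL process named on this slide can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the source rows, the `sales` table, and the cleaning rules are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source rows, standing in for an export from an operational system.
source_rows = [
    {"order_id": "1001", "region": " north ", "amount": "250.00"},
    {"order_id": "1002", "region": "South", "amount": "99.50"},
    {"order_id": "1003", "region": "NORTH", "amount": "not available"},
]


def extract(rows):
    """Extract: read raw records from the source (here, an in-memory list)."""
    return list(rows)


def transform(rows):
    """Transform: tidy region labels, cast types, and skip unparseable amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # assumed rule: rows without a numeric amount are dropped
        clean.append((int(row["order_id"]), row["region"].strip().title(), amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(source_rows)))
loaded = conn.execute(
    "SELECT order_id, region, amount FROM sales ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping the three stages as separate functions is only one possible layout; real pipelines usually add logging and a place to park rejected rows.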
Dimension and Fact tables
Fact Table:
• A fact table is the primary table in a dimensional model.
• A fact table contains:
–Measurements/facts
–Foreign keys to dimension tables
Dimension table:
• A dimension table contains the dimensions of a fact.
• It is joined to the fact table via a foreign key.
• Dimension tables are de-normalized tables.
Contd.
Dimension attributes should be:
1. Verbose (labels consisting of full words)
2. Descriptive
3. Complete (having no missing values)
4. Discretely valued (having only one value per dimension table row)
5. Quality assured (having no misspellings or impossible values)

Star and Snowflake Schema
Star Schema
• The star schema architecture is the simplest data warehouse schema.

• It is called a star schema because the diagram resembles a star, with
points radiating from a center.

• The center of the star consists of fact table and the points of the star
are the dimension tables.
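
As a rough sketch of the layout described above, the following builds a tiny star schema in an in-memory SQLite database. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are all hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: the points of the star (one de-normalized table each).
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    month     TEXT,
    year      INTEGER
);
-- Fact table: the center of the star, holding measures and foreign keys.
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    quantity     INTEGER,
    sales_amount REAL,
    PRIMARY KEY (product_key, date_key)
);
INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO dim_date VALUES
    (20190121, '2019-01-21', 'January', 2019),
    (20190122, '2019-01-22', 'January', 2019);
INSERT INTO fact_sales VALUES
    (1, 20190121, 2, 2000.0),
    (2, 20190121, 1, 300.0),
    (1, 20190122, 1, 1000.0);
""")

# A typical star-schema query: one join from the fact table to each dimension used.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(sales_by_category)
```

The composite primary key on the fact table mirrors the common convention that a fact row is identified by the combination of its dimension keys.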

Figure: Star Schema Example

The main characteristics of star schema:

• Simple structure
• High query efficiency
• Relatively long data-loading times for dimension tables

Snowflake Schema
• The snowflake schema architecture is a more complex variation of the
star schema used in a data warehouse, because the tables that
describe the dimensions are normalized.

• The snowflake schema is represented by centralized fact tables that
are connected to multiple dimensions.

• In the snowflake schema, dimensions are normalized into multiple
related tables, whereas the star schema's dimensions are de-normalized,
with each dimension represented by a single table.
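
To make the contrast concrete, here is a small sketch in which a product dimension is normalized into two related tables, so a category-level query needs one more join than it would in a star schema. The table and column names (`dim_category`, `dim_product`, `fact_sales`) and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized dimension: category attributes live in their own table.
CREATE TABLE dim_category (
    category_key  INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL
);
INSERT INTO dim_category VALUES (10, 'Electronics'), (20, 'Furniture');
INSERT INTO dim_product VALUES (1, 'Laptop', 10), (2, 'Phone', 10), (3, 'Desk', 20);
INSERT INTO fact_sales VALUES (1, 1000.0), (2, 500.0), (3, 300.0);
""")

# Reaching the category name takes two joins: fact -> product -> category.
sales_by_category = conn.execute("""
    SELECT c.category_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_key  = f.product_key
    JOIN dim_category AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(sales_by_category)
```

The category name is stored once per category rather than once per product, which is the redundancy that normalization removes at the cost of the extra join.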
Figure: Snowflake Schema

Slowly Changing Dimensions
• Slowly Changing Dimensions (SCDs) are dimensions whose attributes change slowly over time rather than on a regular basis.
• Data captured by SCDs changes slowly but unpredictably,
rather than according to a regular schedule.
E.g. the transfer of a person changes the ID of his or her regional office.

• These scenarios can sometimes cause referential integrity problems and can be handled
with several methodologies:
• Type 0: retain original
• Type 1: overwrite
• Type 2: add new row
• Type 3: add new attribute
• Type 4: add history table
• Type 6: (1+2+3)
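
The difference between Types 1, 2, and 3 can be sketched with plain Python dictionaries standing in for dimension rows. This is a simplified illustration under stated assumptions, not a reference implementation: the column names (`employee_key`, `office_id`, `start_date`, `end_date`, `is_current`, `previous_office_id`) and the sample employee are hypothetical.

```python
from datetime import date


def new_dimension():
    """Return a one-row employee dimension (hypothetical sample data)."""
    return [{
        "employee_key": 1,          # surrogate key
        "employee_id": "E42",       # business key
        "office_id": "NORTH",
        "start_date": date(2018, 1, 1),
        "end_date": None,
        "is_current": True,
    }]


def apply_type1(rows, employee_id, new_office):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    for row in rows:
        if row["employee_id"] == employee_id:
            row["office_id"] = new_office


def apply_type2(rows, employee_id, new_office, change_date):
    """Type 2: close the current row and add a new row with a new surrogate key."""
    for row in rows:
        if row["employee_id"] == employee_id and row["is_current"]:
            row["end_date"] = change_date
            row["is_current"] = False
    rows.append({
        "employee_key": max(r["employee_key"] for r in rows) + 1,
        "employee_id": employee_id,
        "office_id": new_office,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })


def apply_type3(rows, employee_id, new_office):
    """Type 3: keep the previous value in an additional attribute on the same row."""
    for row in rows:
        if row["employee_id"] == employee_id:
            row["previous_office_id"] = row["office_id"]
            row["office_id"] = new_office


# The same transfer (NORTH -> SOUTH) handled three ways.
type1 = new_dimension()
apply_type1(type1, "E42", "SOUTH")

type2 = new_dimension()
apply_type2(type2, "E42", "SOUTH", date(2019, 1, 22))

type3 = new_dimension()
apply_type3(type3, "E42", "SOUTH")

print(len(type1), len(type2), len(type3))
```

Only Type 2 grows the table, which is why it preserves full history; Type 3 keeps exactly one prior value, and Type 1 keeps none.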

Examples
• Type 1:

• Type 2:

• Type 3:

Contd.
• Type 4:

• Type 6:

Dimension Relationships
• A relationship between a dimension and a measure group consists of
the dimension and fact tables participating in the relationship and a
granularity attribute that specifies the granularity of the dimension in
the particular measure group.
• Types of dimensions:
1. Conformed dimensions
2. Junk dimensions
3. Role Playing dimensions
4. Degenerate dimensions

Conformed and Junk Dimensions
•Conformed dimension
–Shared by multiple fact tables.
–Used when all business users have the same definitions for the dimension.
Figure : Conformed Dimension
•Junk dimension
–Dimension table targeted to a single fact table.
–Used when dimensions have different definitions for different business units.
–In common dimensional-modeling usage, a junk dimension groups miscellaneous low-cardinality flags and indicators into one table.
Figure : Junk Dimension
Role Playing and Degenerate Dimensions
•Role Playing dimension
–Has multiple valid relationships with a fact table.
–Plays different roles in a fact table depending on the context.
Figure : Role Playing Dimension

•Degenerate dimension
–Used by a single fact table.
–Dimension value is stored directly in the fact table.
–No corresponding dimension table.
Figure : Degenerate Dimension
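
A short sketch may help show both ideas at once. Here a single date dimension plays two roles (order date and ship date) by being joined twice under different aliases, and an order number is kept directly in the fact table as a degenerate dimension. The names `dim_date`, `fact_orders`, `order_date_key`, `ship_date_key`, and `order_number` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT
);
CREATE TABLE fact_orders (
    order_number   TEXT,     -- degenerate dimension: no dimension table behind it
    order_date_key INTEGER REFERENCES dim_date(date_key),  -- role: order date
    ship_date_key  INTEGER REFERENCES dim_date(date_key),  -- role: ship date
    amount         REAL
);
INSERT INTO dim_date VALUES (20190121, '2019-01-21'), (20190122, '2019-01-22');
INSERT INTO fact_orders VALUES ('SO-1001', 20190121, 20190122, 250.0);
""")

# The same physical dimension table is joined once per role, using aliases.
orders = conn.execute("""
    SELECT f.order_number,
           od.full_date AS order_date,
           sd.full_date AS ship_date
    FROM fact_orders AS f
    JOIN dim_date AS od ON od.date_key = f.order_date_key
    JOIN dim_date AS sd ON sd.date_key = f.ship_date_key
""").fetchall()
print(orders)
```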
Facts
•Facts are the key metrics used to measure business results:
–Sales
–Production
–Inventory
•Facts can be:
–Additive, e.g. sales
–Semi-additive, e.g. inventory
–Non-additive, e.g. profit percent

What are Fact Tables?
• In data warehousing, a Fact table consists of the measurements, metrics
or facts of a business process.
• It is located at the center of a star schema or a snowflake
schema surrounded by dimension tables.
• A fact table typically has two types of columns: those that contain facts
and those that are foreign keys to dimension tables. The primary key of
a fact table is usually a composite key that is made up of all of its foreign
keys.
• Fact tables contain the content of the data warehouse and store different
types of measures: additive, non-additive, and semi-additive
measures.
Figure 1: Fact Table
Figure 2: Fact Table Example

Granularity

•Granularity refers to the level of detail at which facts are recorded.
•Facts can be at different levels of granularity.
•Granularity is determined based on business needs.
•The grain is the lowest level of information stored in the fact table.
Example: year, month, quarter, period, week, day (date dimension)

Various Types of Measures
1. Additive measures
• Measures that can be added across all dimensions.

2. Semi-additive measures
• Measures that can be added across some, but not all, dimensions.

3. Non-additive measures
• Measures that cannot be added across any dimension.
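
The three measure types can be illustrated with a little arithmetic on a handful of made-up fact rows. The figures and the column layout below are hypothetical; the measures used (sales, inventory, profit percent) follow the examples given earlier in the deck.

```python
# Hypothetical daily fact rows: (day, store, sales, stock_on_hand, profit)
facts = [
    ("2019-01-21", "Store A", 100.0, 50, 20.0),
    ("2019-01-21", "Store B", 200.0, 30, 80.0),
    ("2019-01-22", "Store A", 100.0, 40, 30.0),
    ("2019-01-22", "Store B", 100.0, 20, 20.0),
]

# Additive: sales can be summed across every dimension (days and stores).
total_sales = sum(row[2] for row in facts)

# Semi-additive: stock on hand can be summed across stores for a single day,
# but summing it across days would count the same items repeatedly.
closing_day = max(row[0] for row in facts)
closing_stock = sum(row[3] for row in facts if row[0] == closing_day)

# Non-additive: profit percent cannot be summed at all. It is recomputed
# from its additive components (profit and sales) at the desired level.
total_profit = sum(row[4] for row in facts)
profit_percent = 100.0 * total_profit / total_sales

# For contrast: adding the per-row percentages gives a meaningless figure.
naive_percent_sum = sum(100.0 * row[4] / row[2] for row in facts)

print(total_sales, closing_stock, profit_percent, naive_percent_sum)
```

With these sample rows the overall profit percent is 30, while the naive sum of row percentages is 110, which shows why such ratios are stored as their additive parts.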

Discussion
