CSIS 3300 W3 Denormalization StarSchema
STAR SCHEMA
NIKHIL BHARDWAJ
The 7Ws Framework
Lawrence Corr, Jim Stagnitto
p://www.dama-phila.org/JS20120509.pdf)
Fact Table
Date        CustKey   ProdKey   Item Count   Amount
1/7/2014    1552      95        1            1,798.00
3/2/2014    1552      37        1            27.95
5/7/2015    1552      87        2            320.26
2/21/2016 1552 2387 42 1 19.95
DATE DIMENSION
• One row for every day for which you expect to have
data for the fact table (perhaps generated in a
spreadsheet and imported)
• Usually use a meaningful integer surrogate key in
yyyymmdd form (such as 20160926 for Sep. 26, 2016).
Note: keys in this form sort in chronological order.
• Include rows for missing or future dates to be added
later.
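The bullets above can be illustrated with a minimal Python sketch that generates one row per calendar day and derives the yyyymmdd surrogate key. The function names (`date_key`, `build_date_dimension`) and the column names are assumptions made for illustration; they are not taken from the course material.

```python
from datetime import date, timedelta

# Weekday names indexed by date.weekday() (Monday == 0).
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def date_key(d: date) -> int:
    """Return the yyyymmdd integer surrogate key for a date."""
    return d.year * 10000 + d.month * 100 + d.day


def build_date_dimension(start: date, end: date) -> list[dict]:
    """Generate one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date_key": date_key(current),
            "full_date": current.isoformat(),
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "weekday": WEEKDAYS[current.weekday()],
        })
        current += timedelta(days=1)
    return rows


# Hypothetical range: all of 2016, including days with no facts yet.
rows = build_date_dimension(date(2016, 1, 1), date(2016, 12, 31))
```

Because the key is built as year, then month, then day, sorting the integer keys gives the same order as sorting the dates themselves.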
DEGENERATE DIMENSIONS
Combine
DENORMALIZATION
Expand / Calculate
1. Denormalize and add region (e.g. NA, EMEA)
2. Denormalize and add location data based on IP address
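Both steps can be sketched in Python as follows. The lookup tables, the `country` and `ip` columns, and the customer rows other than key 1552 are hypothetical; the IP ranges are reserved documentation networks, and a real lookup would rely on reference data or a geolocation provider.

```python
import ipaddress

# Hypothetical lookup tables, made up for this illustration.
COUNTRY_TO_REGION = {"US": "NA", "CA": "NA", "DE": "EMEA", "FR": "EMEA"}
NETWORK_TO_CITY = {
    ipaddress.ip_network("192.0.2.0/24"): "Vancouver",
    ipaddress.ip_network("198.51.100.0/24"): "Berlin",
}


def lookup_city(ip: str) -> str:
    """Return the city whose network contains ip, or 'Unknown'."""
    addr = ipaddress.ip_address(ip)
    for network, city in NETWORK_TO_CITY.items():
        if addr in network:
            return city
    return "Unknown"


def denormalize(row: dict) -> dict:
    """Return a copy of row with region and city stored directly on it."""
    enriched = dict(row)
    enriched["region"] = COUNTRY_TO_REGION.get(row["country"], "Unknown")
    enriched["city"] = lookup_city(row["ip"])
    return enriched


# Illustrative source rows (not real data).
customers = [
    {"cust_key": 1552, "country": "CA", "ip": "192.0.2.17"},
    {"cust_key": 1553, "country": "DE", "ip": "198.51.100.8"},
    {"cust_key": 1554, "country": "BR", "ip": "203.0.113.5"},
]
customer_dim = [denormalize(c) for c in customers]
```

Once `region` and `city` are stored on each dimension row, a query can filter or group by them without a lookup at query time.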
DENORMALIZATION - LAB
STAR SCHEMA
• Simpler queries
• Simplified business logic, because extra attributes are
stored directly in the dimensions. For example, instead of
storing only a date-time value, the date dimension stores
attributes such as week, weekday, month, quarter and year.
A business analyst can later query on any of these
attributes without having to calculate them in the
application.
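To make the "simpler queries" point concrete, here is a small sketch using Python's built-in sqlite3 module. The table and column names (`dim_date`, `fact_sales`, and so on) are assumptions; the fact rows reuse the first three sample rows from the Fact Table slide, with dates converted to yyyymmdd keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year INTEGER, quarter INTEGER, month INTEGER, weekday TEXT
);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    cust_key INTEGER, prod_key INTEGER,
    item_count INTEGER, amount REAL
);
""")

# Derived date attributes are stored once, in the dimension.
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?, ?)", [
    (20140107, 2014, 1, 1, "Tuesday"),
    (20140302, 2014, 1, 3, "Sunday"),
    (20150507, 2015, 2, 5, "Thursday"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", [
    (20140107, 1552, 95, 1, 1798.00),
    (20140302, 1552, 37, 1, 27.95),
    (20150507, 1552, 87, 2, 320.26),
])

# Sales by quarter: one join, no date arithmetic in the query.
totals = conn.execute("""
    SELECT d.year, d.quarter, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON f.date_key = d.date_key
    GROUP BY d.year, d.quarter
    ORDER BY d.year, d.quarter
""").fetchall()
```

The query groups by `quarter` simply by naming the column; switching to `weekday` or `month` only changes the column list.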
Star Schema for Foodmart
SNOWFLAKE SCHEMA
• For an M:N relationship, a snowflake schema is the better fit:
it uses an intersection table to resolve the M:N relationship.
• In a snowflake schema we keep the main dimension
close to the fact table, and the second dimension is
joined to the main dimension via the intersection table.
• E.g. if we design a schema for a library system in which
book and author have an M:N relationship, we might connect
the book dimension to the fact table. The author dimension
can then be joined to the book dimension via a BookAuthor
intersection table. More on this in the case study.
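The library example might look like the following sqlite3 sketch. All table names, column names, and rows here are hypothetical placeholders, including the `fact_loan` table and its `loan_count` measure; the case study may model it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_book (book_key INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE dim_author (author_key INTEGER PRIMARY KEY, name TEXT);
-- Intersection table resolving the M:N book/author relationship.
CREATE TABLE book_author (
    book_key INTEGER REFERENCES dim_book(book_key),
    author_key INTEGER REFERENCES dim_author(author_key),
    PRIMARY KEY (book_key, author_key)
);
-- The fact table references only the main (book) dimension.
CREATE TABLE fact_loan (
    date_key INTEGER,
    book_key INTEGER REFERENCES dim_book(book_key),
    loan_count INTEGER
);
""")

conn.executemany("INSERT INTO dim_book VALUES (?, ?)",
                 [(1, "Book A"), (2, "Book B")])
conn.executemany("INSERT INTO dim_author VALUES (?, ?)",
                 [(10, "Author X"), (11, "Author Y")])
# Book A has two authors; Author X wrote both books.
conn.executemany("INSERT INTO book_author VALUES (?, ?)",
                 [(1, 10), (1, 11), (2, 10)])
conn.executemany("INSERT INTO fact_loan VALUES (?, ?, ?)",
                 [(20160926, 1, 3), (20160926, 2, 5)])

# Reach the author dimension through book and the intersection table.
loans_by_author = conn.execute("""
    SELECT a.name, SUM(f.loan_count)
    FROM fact_loan AS f
    JOIN dim_book AS b ON f.book_key = b.book_key
    JOIN book_author AS ba ON b.book_key = ba.book_key
    JOIN dim_author AS a ON ba.author_key = a.author_key
    GROUP BY a.name
    ORDER BY a.name
""").fetchall()
```

One design consequence worth noting: a loan of a co-authored book is counted once per author, so per-author totals should not be added together to get overall loans.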
MULTIPLE FACT TABLES