Building the Data Warehouse - Part 1

1 Overview
Data Warehouse Overview

2 Data Process
Overview of the Data Process

3 Database
Building the Azure SQL Database

4 Staging Master Data
Building the Staging Layer Master Data

5 Staging Transactions
Building Staging Layer Transactional Data

MODULE 10
SECTION 1
Overview
Overview
Data Warehouse Overview

Single source of truth to store data from multiple sources for historical and analytical reporting

01 Data Integration and Isolation
• Integrate many sources of data
• Isolate from production transaction systems
• Protect from source system upgrades

02 Data Quality
• One version of the truth
• Improve data quality from source systems
• Provide business friendly naming conventions

03 Data Analysis
• Optimized for read access
• Build hierarchies to analyze data
• Enable self-serve data analysis for business users
Overview
Data Warehouse Models

Inmon Model
• Enterprise data model (CIF) or enterprise data warehouse (EDW)
• IT driven, users have passive participation
• Centralized atomic normalized tables
• Dependent data marts that are separate physical subsets of EDW
[Diagram: OLTP Data Sources → Staging → Data Warehouse (Normalized) → Data Mart → Reporting Layer (Cube, Tabular)]

Kimball Model
• Logical data warehouse with subject area - data marts
• Business driven, users have active participation
• Decentralized data marts
• Independent dimensional data marts for reporting/analytics
[Diagram: OLTP Data Sources → Staging → Data Mart → Reporting Layer (Cube, Tabular)]


Overview
Data Warehouse Vino World
Star Schema Data Model
• Facts and Dimensions, star schema
• Denormalized Fact Sales Table with Measures
• Integrated via Conformed Dimensions providing consistency across data sources
• Slowly changing dimensions with surrogate keys
• Business friendly for direct end-user data access

[Diagram: FACT Sales in the center, linked to Dim Date, Dim Store, Dim Territory, Dim Currency, and Dim Product]

What questions can be answered?
• What are the total sales?
• What is the gross profit?
• What are the total sales by store?
• What are the total sales by product?
• What are the total sales by Territory?
• Which month had the highest sales and for which product?
• Which product is the most profitable?
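
A minimal sketch of how a star schema answers a few of the questions above. It uses Python's built-in sqlite3 module as a stand-in for the warehouse database. Only the table names dimStore, dimProduct, and factSales come from the slides; every column name and sample row is an assumption made for illustration, and gross profit is assumed to be sales amount minus cost amount.

```python
import sqlite3

# Hypothetical columns and sample rows; only the table names come from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dimStore   (StoreKey   INTEGER PRIMARY KEY, StoreName   TEXT);
CREATE TABLE dimProduct (ProductKey INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE factSales (
    StoreKey    INTEGER REFERENCES dimStore(StoreKey),
    ProductKey  INTEGER REFERENCES dimProduct(ProductKey),
    SalesAmount REAL,   -- measure
    CostAmount  REAL    -- measure (assumed, needed for gross profit)
);
INSERT INTO dimStore   VALUES (1, 'Downtown'), (2, 'Harbour');
INSERT INTO dimProduct VALUES (10, 'Merlot'), (20, 'Riesling');
INSERT INTO factSales  VALUES
    (1, 10, 100.0,  60.0),
    (1, 20,  50.0,  20.0),
    (2, 10, 200.0, 120.0);
""")

# What are the total sales? What is the gross profit?
total_sales, gross_profit = conn.execute(
    "SELECT SUM(SalesAmount), SUM(SalesAmount - CostAmount) FROM factSales"
).fetchone()

# What are the total sales by store? Join the fact to one dimension and group.
sales_by_store = conn.execute("""
    SELECT s.StoreName, SUM(f.SalesAmount)
    FROM factSales AS f
    JOIN dimStore AS s ON s.StoreKey = f.StoreKey
    GROUP BY s.StoreName
    ORDER BY s.StoreName
""").fetchall()

# Which product is the most profitable?
most_profitable = conn.execute("""
    SELECT p.ProductName
    FROM factSales AS f
    JOIN dimProduct AS p ON p.ProductKey = f.ProductKey
    GROUP BY p.ProductName
    ORDER BY SUM(f.SalesAmount - f.CostAmount) DESC
    LIMIT 1
""").fetchone()[0]

print(total_sales, gross_profit, sales_by_store, most_profitable)
```

Each question follows the same pattern: aggregate a measure from the fact table, and join to a dimension only when the question groups or filters by it.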
SECTION 2
Data Process
Data Process
Overview of the Data Process
Data Processing
• Integrate multiple data sources
• Stage the data
• Scrub the data due to data quality issues
• Transform the data and load into the Data Warehouse

[Diagram: OLTP Data Sources → Staging → FACT Sales with Dim Date, Dim Store, Dim Territory, Dim Currency, Dim Product]

Extract, Transform, and Load (ETL)
• No Staging Tables
• Transform the data while hitting the source system
• Processing done by the ETL Tools

Extract, Load, and Transform (ELT)
• Uses Staging Tables
• Transform the data while loading into target system
• Processing done by target database engine

ELT is better suited for large volumes of data and for a modern data warehouse architecture
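
To make the ETL and ELT contrast concrete, here is a small sketch under stated assumptions: Python plays the role of the ETL tool, an in-memory SQLite database plays the role of the target engine, and the table names `sales` and `stage_sales`, the helper `clean_name`, and the sample rows are all hypothetical.

```python
import sqlite3

# Made-up source rows: (date, product name with messy casing, amount as text).
source_rows = [("2024-01-05", " merlot ", "12.50"), ("2024-01-06", "RIESLING", "9.00")]

def clean_name(name: str) -> str:
    """Trim whitespace and normalize casing (single-word names assumed)."""
    return name.strip().title()

# ETL: the tool (Python here) transforms the rows, then loads the finished result.
etl_db = sqlite3.connect(":memory:")
etl_db.execute("CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL)")
etl_db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(d, clean_name(p), float(a)) for d, p, a in source_rows],
)

# ELT: load the raw rows into a staging table first,
# then let the target database engine do the transformation in SQL.
elt_db = sqlite3.connect(":memory:")
elt_db.execute("CREATE TABLE stage_sales (sale_date TEXT, product TEXT, amount TEXT)")
elt_db.executemany("INSERT INTO stage_sales VALUES (?, ?, ?)", source_rows)
elt_db.execute("""
    CREATE TABLE sales AS
    SELECT sale_date,
           UPPER(SUBSTR(TRIM(product), 1, 1)) || LOWER(SUBSTR(TRIM(product), 2)) AS product,
           CAST(amount AS REAL) AS amount
    FROM stage_sales
""")

etl_result = etl_db.execute("SELECT * FROM sales ORDER BY sale_date").fetchall()
elt_result = elt_db.execute("SELECT * FROM sales ORDER BY sale_date").fetchall()
print(etl_result == elt_result)
```

Both paths end with the same rows. The difference is where the work runs: in the tool before loading (ETL) or inside the database after loading (ELT).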
Data Process
Overview of the ELT Process

1 Extract
• Extract source files to raw zone
• Transaction files
• Master data files

2 Load
• Load transaction/master files to cleansed zone
• Scrub files for data quality issues
• Harmonize data for consistent format
• Load cleansed data into Staging Layer

3 Transform
• Transform data to dimensional model
• Load to data warehouse
• Build dimensions
• Build Facts
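
The move from the raw zone to the cleansed zone can be sketched with plain files. This is an illustration only: local temporary folders stand in for the data lake zones, and the file name, column names, the two accepted date formats, and the rule "drop rows without a product id" are all assumptions. Loading into the Staging Layer and the Transform step are not shown here.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path

# Hypothetical zone folders standing in for data lake containers.
lake = Path(tempfile.mkdtemp())
raw_zone, cleansed_zone = lake / "raw", lake / "cleansed"
raw_zone.mkdir()
cleansed_zone.mkdir()

# 1 Extract: land the source file in the raw zone unchanged.
raw_file = raw_zone / "sales.csv"
raw_file.write_text(
    "SaleDate,ProductId,Amount\n"
    "05/01/2024, P1 ,12.50\n"
    "2024-01-06,,9.00\n"
    "2024-01-07,P2,20.00\n",
    encoding="utf-8",
)

def harmonize_date(value: str) -> str:
    """Return the date as ISO yyyy-mm-dd, accepting two assumed source formats."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")

# 2 Load: scrub and harmonize each row, then write it to the cleansed zone.
cleansed_file = cleansed_zone / "sales.csv"
with raw_file.open(newline="", encoding="utf-8") as src, \
     cleansed_file.open("w", newline="", encoding="utf-8") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        product_id = row["ProductId"].strip()
        if not product_id:  # scrub: skip rows with a missing product id
            continue
        writer.writerow({
            "SaleDate": harmonize_date(row["SaleDate"]),   # harmonize: one date format
            "ProductId": product_id,                       # scrub: trimmed whitespace
            "Amount": f"{float(row['Amount']):.2f}",       # harmonize: two decimals
        })

with cleansed_file.open(newline="", encoding="utf-8") as f:
    cleansed_rows = list(csv.DictReader(f))
print(cleansed_rows)
```

Keeping the raw file untouched means the scrub and harmonize rules can be changed and rerun without going back to the source system.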
SECTION 3
Database
Database
Overview of Database Objects

[Diagram: OLTP Data Sources → Staging → FACT Sales with Dim Date, Dim Store, Dim Territory, Dim Currency, Dim Product]

Staging
◼ stage.Dates
◼ stage.Currency
◼ stage.Product
◼ stage.Store
◼ Stage.Territory
◼ stage.Verde_Products
◼ stage.Arancione_Products
◼ stage.Celeste_Products
◼ stage.Verde_Sales
◼ stage.Celeste_Sales
◼ Stage.Arancione_Sales
◼ stage.Sales

Data Warehouse
◼ dimDate
◼ dimCurrency
◼ dimProduct
◼ dimStore
◼ dimTerritory
◼ factSales

We will build staging tables and data warehouse tables in Azure SQL DB
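
A rough sketch of creating a few of these objects, using SQLite through Python as a stand-in for Azure SQL DB. SQLite has no schemas, so a second in-memory database is attached under the name `stage` to get the same two-part names; in Azure SQL DB the equivalent would be a schema created with `CREATE SCHEMA`. The table names come from the list above, but every column is an assumption.

```python
import sqlite3

# SQLite stands in for Azure SQL DB. Attaching a second in-memory database
# named "stage" gives the two-part names (stage.Product) used on the slide.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS stage")

# Column lists are hypothetical; only the table names come from the slide.
conn.executescript("""
CREATE TABLE stage.Product (ProductId TEXT, ProductName TEXT, Category TEXT);
CREATE TABLE stage.Sales   (SaleDate TEXT, ProductId TEXT, StoreId TEXT, Amount REAL);

CREATE TABLE dimProduct (
    ProductKey  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    ProductId   TEXT NOT NULL,                      -- business key from the source
    ProductName TEXT,
    Category    TEXT
);
CREATE TABLE factSales (
    DateKey     INTEGER,
    ProductKey  INTEGER REFERENCES dimProduct(ProductKey),
    StoreKey    INTEGER,
    SalesAmount REAL
);
""")

# List what was created in each layer.
stage_tables = [r[0] for r in conn.execute(
    "SELECT name FROM stage.sqlite_master WHERE type = 'table' ORDER BY name")]
dw_tables = [r[0] for r in conn.execute(
    "SELECT name FROM main.sqlite_master WHERE type = 'table' "
    "AND name NOT LIKE 'sqlite_%' ORDER BY name")]
print(stage_tables, dw_tables)
```

The staging tables mirror the source files, while the dimension carries both a surrogate key generated by the warehouse and the business key from the source.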
SECTION 4
Staging Master Data
Staging Master Data
Building the Staging Layer Master Data
[Diagram: On Premise Source Data → Azure Data Lake Storage (Landing, Raw, Cleansed) → Azure SQL DB (Stage): Staging → Azure SQL DB (DW): FACT Sales with Dim Date, Dim Store, Dim Territory, Dim Currency, Dim Product]

Source from ADLS
• Source scrubbed data from cleansed zone
• Master Data

Load to Stage - Azure SQL DB
• Load Master data from ADLS cleansed zone
• Load data into stage tables in Azure SQL DB

In this step we will load master data from the cleansed container in Azure Storage to stage tables in Azure SQL DB
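
As an illustration of this step, the sketch below loads one master data file into a stage table. It makes several assumptions: an in-memory text buffer stands in for a file read from the cleansed container, SQLite stands in for Azure SQL DB, the helper `load_master` and the columns of stage.Store are hypothetical, and the load strategy (empty the table, then reload everything) is one common choice rather than something the slides specify.

```python
import csv
import io
import sqlite3

# Stand-in for a file read from the cleansed container; the contents are made up.
cleansed_store_csv = io.StringIO(
    "StoreId,StoreName,Territory\n"
    "S1,Downtown,North\n"
    "S2,Harbour,South\n"
)

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS stage")
conn.execute("CREATE TABLE stage.Store (StoreId TEXT, StoreName TEXT, Territory TEXT)")

def load_master(conn, table, csv_file):
    """Full reload: empty the stage table, then insert every row from the file.

    The table name is interpolated into the SQL text, so it must come from
    trusted code and never from user input.
    """
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames
    rows = [tuple(record[c] for c in columns) for record in reader]
    placeholders = ", ".join("?" for _ in columns)
    with conn:  # one transaction: the delete and the inserts commit together
        conn.execute(f"DELETE FROM {table}")
        conn.executemany(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
            rows,
        )
    return len(rows)

loaded = load_master(conn, "stage.Store", cleansed_store_csv)

# Running the load again does not duplicate rows, because the table is emptied first.
cleansed_store_csv.seek(0)
load_master(conn, "stage.Store", cleansed_store_csv)
store_count = conn.execute("SELECT COUNT(*) FROM stage.Store").fetchone()[0]
print(loaded, store_count)
```

A full reload suits master data because the files are small and a rerun leaves the stage table in the same state.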
SECTION 5
Staging Transactions
Staging Transactions
Building the Staging Layer Transactions
[Diagram: On Premise Source Data → Azure Data Lake Storage (Landing, Raw, Cleansed) → Azure SQL DB (Stage): Staging → Azure SQL DB (DW): FACT Sales with Dim Date, Dim Store, Dim Territory, Dim Currency, Dim Product]

Source from ADLS
• Source scrubbed data from cleansed zone
• Transaction Data

Load to Stage - Azure SQL DB
• Load Transaction data from ADLS cleansed zone
• Load data into stage sales table in Azure SQL DB

In this step we will load transaction data from the cleansed container in Azure Storage to stage tables in Azure SQL DB
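
The database objects listed earlier include one sales stage table per source (Verde, Celeste, Arancione) plus a combined stage.Sales table. The sketch below shows one possible way to fill the combined table. It assumes that stage.Sales is the union of the three source tables with a source tag added; the slides do not confirm this. SQLite stands in for Azure SQL DB, and the columns and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS stage")

# Table names follow the slides; columns and rows are hypothetical.
sources = ["Verde", "Celeste", "Arancione"]
for name in sources:
    conn.execute(
        f"CREATE TABLE stage.{name}_Sales (SaleDate TEXT, ProductId TEXT, Amount REAL)"
    )
conn.execute(
    "CREATE TABLE stage.Sales "
    "(SourceSystem TEXT, SaleDate TEXT, ProductId TEXT, Amount REAL)"
)

# Pretend these rows were already loaded from the cleansed transaction files.
sample = {
    "Verde":     [("2024-01-05", "V-1", 12.5)],
    "Celeste":   [("2024-01-05", "C-7", 30.0), ("2024-01-06", "C-8", 5.0)],
    "Arancione": [("2024-01-07", "A-3", 8.0)],
}
with conn:
    for name, rows in sample.items():
        conn.executemany(f"INSERT INTO stage.{name}_Sales VALUES (?, ?, ?)", rows)

# Assumed consolidation step: tag each row with its source and append to stage.Sales.
with conn:
    conn.execute("DELETE FROM stage.Sales")
    for name in sources:
        conn.execute(
            f"INSERT INTO stage.Sales "
            f"SELECT ?, SaleDate, ProductId, Amount FROM stage.{name}_Sales",
            (name,),
        )

rows_per_source = conn.execute(
    "SELECT SourceSystem, COUNT(*), SUM(Amount) FROM stage.Sales "
    "GROUP BY SourceSystem ORDER BY SourceSystem"
).fetchall()
print(rows_per_source)
```

Tagging each row with its source keeps the combined table traceable back to the file it came from, which helps when reconciling totals per source.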
Module Summary
In this module we learnt

Overview
• We got an overview of the Data Warehouse and its benefits.
• We learnt about the different Data Warehouse approaches and data processing approaches

Integration
• We learnt about the Star Schema Model we will use for our Data Warehouse
• We learnt about the ELT approach that we will use to build the Data Warehouse

Hands-On
• We learnt how to build the Azure SQL database objects
• We then learnt how to stage our master data and transactional data
