Worksheet in Datawarehousing Case Study - Building ETL Processes
FIRST_NAME STRING(40)
MIDDLE_NAME STRING(40)
LAST_NAME STRING(40)
SSN STRING(9)
STREET_NAME,APT_NO STRING(30),STRING(7)
CUST_CITY STRING(30)
CUST_STATE STRING(30)
CUST_COUNTRY STRING(30)
CUST_ZIP STRING(7)
CUST_PHONE NUMBER(10)
CUST_EMAIL STRING(40)
- -
- -
Target Table/File name Target Field names Target DataType
CDW_SAPP_D_SUPPLIER SUPPLIER_ID NUMBER(10)
SUPPLIER_NAME STRING(50)
SUPPLIER_SSN STRING(9)
SUPPLIER_PHONE NUMBER(10)
SUPPLIER_LOC STRING(30)
Target Table/File name Target Field names Target DataType
CDW_SAPP_D_PRODUCT PRODUCT_CODE NUMBER(10)
Primary key of the Supplier (Foreign Key for this table) Join with the Supplier table, based on SSN from the Supplier table, and load the corresponding Supplier ID
Price of the product Direct move
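The supplier lookup described above can be sketched as follows. This is a minimal illustration, not the worksheet's actual mapping: the in-memory dictionary standing in for CDW_SAPP_D_SUPPLIER, the sample SSN values, and the function name `lookup_supplier_id` are all assumptions.

```python
from typing import Optional

# Hypothetical in-memory copy of the supplier dimension,
# keyed by SUPPLIER_SSN with SUPPLIER_ID as the value (sample data only).
supplier_dim = {
    "123456789": 1001,
    "987654321": 1002,
}


def lookup_supplier_id(supplier_ssn: Optional[str]) -> Optional[int]:
    """Return the SUPPLIER_ID matching a product row's supplier SSN.

    Returns None when the SSN is missing or has no match, so the caller
    can decide whether to reject the row or log an error.
    """
    if supplier_ssn is None:
        return None
    return supplier_dim.get(supplier_ssn.strip())
```

Returning None for an unmatched SSN, rather than raising, is a design choice made here so that unmatched rows can be routed to an error table.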
Source File Name Source File Column Names Source Data Type
CDW_SAPP_PRODUCT.txt PRODUCT_CODE NUMBER(10)
Zip postal code (default value 999999) If the source value is null, load the default value; else Direct move
Phone number of the branch Change the format of the phone number to (XXX)XXX-XXXX
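The two branch cleansing rules above (default zip and phone reformatting) could look like the sketch below. The function names are hypothetical, and treating an empty string as null and rejecting phone numbers that do not have exactly ten digits are assumptions the worksheet does not state.

```python
import re
from typing import Optional, Union

DEFAULT_ZIP = "999999"  # default value given in the worksheet


def clean_zip(zip_code: Optional[str]) -> str:
    """Load the default zip when the source value is null, else move it as-is."""
    # Assumption: a blank string is treated the same as a null value.
    if zip_code is None or zip_code.strip() == "":
        return DEFAULT_ZIP
    return zip_code.strip()


def format_phone(phone: Union[int, str]) -> str:
    """Reformat a 10-digit phone number as (XXX)XXX-XXXX."""
    digits = re.sub(r"\D", "", str(phone))  # keep digits only
    if len(digits) != 10:
        # Assumption: anything other than ten digits is an error.
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return f"({digits[:3]}){digits[3:6]}-{digits[6:]}"
```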
Source File Name Source File Column Names Source Data Type
CDW_SAPP_BRANCH.txt BRANCH_CODE NUMBER(9)
Description
Identifies the record in which the error occurred
Uniquely identifies the error record
Defines the Error
Column name in which the error occurred
Mapping logic
seq NUMBER
Look up from Time dim table and load the TIMEID
Look up from Time dim table and load the TIMEID
Look up from Time dim table and load the TIMEID
look up from customer dim table.
look up from supplier dim table
Look up Branch Code using Branch name from the Location Dimension Table
Direct move
Look up Product Code using Product Name from the Product Dimension Table
Direct move
Direct move from the file/check for decimal values
Need to be calculated from the product table data (Price of product and sales_qty).
sysdate
workflow name
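Two of the staging rules above, the decimal check on the quantity and the total amount calculation, can be sketched as follows. The function names are hypothetical, and the sketch assumes the product price has already been looked up from the product dimension and that the total is rounded half-up to two decimals to fit NUMBER(8,2).

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP


def validate_quantity(raw_value: str) -> int:
    """Check the source quantity for decimal values and return it as an int."""
    try:
        qty = Decimal(raw_value)
    except InvalidOperation:
        raise ValueError(f"quantity is not numeric: {raw_value!r}")
    if qty != qty.to_integral_value():
        raise ValueError(f"quantity has a fractional part: {raw_value!r}")
    return int(qty)


def total_amount(price: Decimal, sales_qty: int) -> Decimal:
    """Compute SALES_TOTAL_AMOUNT as product price times quantity sold."""
    # Assumption: round half-up to two decimal places.
    return (price * sales_qty).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, `total_amount(Decimal("19.99"), validate_quantity("3"))` yields `Decimal("59.97")`.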
Mapping logic
Need to load the primary key of the record (CDW_P_SALES_DSET_KEY) in which the error occurred
Generate the error code based on the error.
Decode the error code and set a relevant message
Load the name of the column that has the error
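The error-handling mapping above might be sketched like this. The worksheet does not list the error codes, messages, or error table column names, so `ERROR_MESSAGES`, the `E00x` codes, and the field names `ERROR_SEQ`, `ERROR_CODE`, `ERROR_MESSAGE`, and `ERROR_COLUMN` are all hypothetical; only CDW_P_SALES_DSET_KEY comes from the worksheet.

```python
# Hypothetical error codes and their decoded messages.
ERROR_MESSAGES = {
    "E001": "Lookup found no matching dimension row",
    "E002": "Required value is null",
    "E003": "Value is not a whole number",
}


def decode_error(error_code: str) -> str:
    """Decode an error code into a readable message."""
    return ERROR_MESSAGES.get(error_code, "Unknown error")


def build_error_record(error_seq: int, sales_key: int,
                       error_code: str, column_name: str) -> dict:
    """Assemble one error row for a rejected staging record."""
    return {
        "ERROR_SEQ": error_seq,             # uniquely identifies the error record
        "CDW_P_SALES_DSET_KEY": sales_key,  # record in which the error occurred
        "ERROR_CODE": error_code,
        "ERROR_MESSAGE": decode_error(error_code),
        "ERROR_COLUMN": column_name,        # column in which the error occurred
    }
```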
Source File Name Source File Column Names Source Data Type
- - -
CDW_SAPP_F_SALES_BR_XX DAY NUMBER(2)
CDW_SAPP_F_SALES_BR_XX MONTH NUMBER(2)
CDW_SAPP_F_SALES_BR_XX YEAR NUMBER(4)
CDW_SAPP_F_SALES_BR_XX CUSTOMER_SSN NUMBER(9)
CDW_SAPP_F_SALES_BR_XX SUPPLIER_SSN NUMBER(9)
CDW_SAPP_F_SALES_BR_XX BRANCH_NAME STRING(25)
CDW_SAPP_F_SALES_BR_XX BRANCH_NAME STRING(25)
CDW_SAPP_F_SALES_BR_XX PRODUCT_NAME STRING(30)
CDW_SAPP_F_SALES_BR_XX PRODUCT_NAME STRING(30)
CDW_SAPP_F_SALES_BR_XX QUANTITY_SOLD NUMBER(7)
CDW_SAPP_F_SALES_BR_XX QUANTITY_SOLD NUMBER(7)
CDW_SAPP_F_SALES_BR_XX NA NA
CDW_SAPP_F_SALES_BR_XX NA NA
Source File Name Source File Column Names Source Data Type
- - -
- - -
- - -
CDW_SAPP_F_SALES_BR_XX - -
Target Table/File name Target Field names Target DataType
CDW_SAPP_STG_SALES CDW_P_SALES_DSET_KEY NUMBER(9)
CDW_SAPP_STG_SALES SALES_F_PERIOD_KEY VARCHAR2(8)
CDW_SAPP_STG_SALES SALES_F_PERIOD_KEY VARCHAR2(8)
CDW_SAPP_STG_SALES SALES_F_PERIOD_KEY VARCHAR2(8)
CDW_SAPP_STG_SALES SALES_F_CUSTOMER_KEY NUMBER(10)
CDW_SAPP_STG_SALES SALES_F_SUPPLIER_KEY NUMBER(10)
CDW_SAPP_STG_SALES SALES_F_BRANCH_CODE NUMBER(9)
CDW_SAPP_STG_SALES SALES_F_BRANCH_NAME VARCHAR2(25)
CDW_SAPP_STG_SALES SALES_F_PRODUCT_CODE NUMBER(10)
CDW_SAPP_STG_SALES SALES_F_PRODUCT_NAME VARCHAR2(30)
CDW_SAPP_STG_SALES SALES_SOLD_QTY NUMBER(7)
CDW_SAPP_STG_SALES SALES_TOTAL_AMOUNT NUMBER(8,2)
CDW_SAPP_STG_SALES CREATED_DATE TIMESTAMP(0)
CDW_SAPP_STG_SALES CREATED_BY VARCHAR2(40)
Description
Identifies the record in which the error occurred
Uniquely identifies the error record
Defines the Error
Column name in which the error occurred
Mapping logic
seq NUMBER
Look up from Time dim table and load the time key.
Look up from Time dim table and load the time key.
Look up from Time dim table and load the time key.
look up from customer dim table.
look up from supplier dim table
look up from Location dim table
Direct move
look up from product code
Direct move
Direct move from the file/check for decimal values
Need to be calculated from the product table data (Price of product) and sales_qty.
sysdate
workflow name
Mapping logic
Need to load the primary key of the record (CDW_P_SALES_DSET_KEY) in which the error occurred
Generate the error code based on the error.
Decode the error code and set a relevant message
Load the name of the column that has the error
Source File Name Source File Column Names Source Data Type
- - -
CDW_SAPP_F_SALES_BR_XX DAY NUMBER(2)
CDW_SAPP_F_SALES_BR_XX MONTH NUMBER(2)
CDW_SAPP_F_SALES_BR_XX YEAR NUMBER(4)
CDW_SAPP_F_SALES_BR_XX CUSTOMER_SSN NUMBER(9)
CDW_SAPP_F_SALES_BR_XX SUPPLIER_SSN NUMBER(9)
CDW_SAPP_F_SALES_BR_XX BRANCH_NAME STRING(25)
CDW_SAPP_F_SALES_BR_XX BRANCH_NAME STRING(25)
CDW_SAPP_F_SALES_BR_XX PRODUCT_NAME STRING(30)
CDW_SAPP_F_SALES_BR_XX PRODUCT_NAME STRING(30)
CDW_SAPP_F_SALES_BR_XX QUANTITY_SOLD NUMBER(7,2)
CDW_SAPP_F_SALES_BR_XX QUANTITY_SOLD NUMBER(7)
CDW_SAPP_F_SALES_BR_XX NA NA
CDW_SAPP_F_SALES_BR_XX NA NA
Source File Name Source File Column Names Source Data Type
- - -
- - -
- - -
CDW_SAPP_F_SALES_BR_XX - -
Target Table/File name Target Field names Target DataType
CDW_SAPP_F_SALES CDW_XYZ_F_SALES_DSET_KEY NUMBER(9)
CDW_SAPP_F_SALES SALES_F_PERIOD_KEY DATE
CDW_SAPP_F_SALES SALES_F_CUSTOMER_Key NUMBER(10)
CDW_SAPP_F_SALES SALES_F_SUPPLIER_Key NUMBER(10)
CDW_SAPP_F_SALES SALES_F_BRANCH_CODE NUMBER(9)
CDW_SAPP_F_SALES SALES_F_BRANCH_NAME VARCHAR2(25)
CDW_SAPP_F_SALES SALES_F_PRODUCT_CODE NUMBER(10)
CDW_SAPP_F_SALES SALES_F_PRODUCT_NAME VARCHAR2(30)
CDW_SAPP_F_SALES SALES_SOLD_QTY NUMBER(7)
CDW_SAPP_F_SALES SALES_TOTAL_AMOUNT NUMBER(8,2)
CDW_SAPP_F_SALES CREATED_DATE TIMESTAMP(0)
CDW_SAPP_F_SALES CREATED_BY VARCHAR2(40)
SQL override.
Description
Surrogate key for the fact table
Surrogate key of the period(time) table
surrogate key of the customer dim table
surrogate key of the supplier dim table
Surrogate key of the branch table
Name of the Branch
Product code
Name of the Product
Quantity of product sold
Total amount
Load date
The workflow name
Mapping logic
Direct Move
Direct Move (Convert to date datatype)
Direct Move
Direct Move
Direct Move
Direct Move
Direct Move
Direct Move
Direct Move
Direct Move
Direct Move
Direct Move
Source Table Name Source File Column Names Source Data Type
CDW_SAPP_STG_SALES CDW_P_SALES_DSET_KEY NUMBER(9)
CDW_SAPP_STG_SALES SALES_F_PERIOD_KEY VARCHAR2(8)
CDW_SAPP_STG_SALES SALES_F_CUSTOMER_KEY NUMBER(10)
CDW_SAPP_STG_SALES SALES_F_SUPPLIER_KEY NUMBER(10)
CDW_SAPP_STG_SALES SALES_F_BRANCH_CODE NUMBER(9)
CDW_SAPP_STG_SALES SALES_F_BRANCH_NAME VARCHAR2(25)
CDW_SAPP_STG_SALES SALES_F_PRODUCT_CODE NUMBER(10)
CDW_SAPP_STG_SALES SALES_F_PRODUCT_NAME VARCHAR2(30)
CDW_SAPP_STG_SALES SALES_SOLD_QTY NUMBER(7)
CDW_SAPP_STG_SALES SALES_TOTAL_AMOUNT NUMBER(8,2)
CDW_SAPP_STG_SALES CREATED_DATE NA
CDW_SAPP_STG_SALES CREATED_BY NA
Target Table/File name Target Table/File name
CDW_SAPP_F_AGG_DATA CDW_SAPP_F_AGG_DATA
CDW_SAPP_F_AGG_DATA CDW_SAPP_F_AGG_DATA
CDW_SAPP_F_AGG_DATA CDW_SAPP_F_AGG_DATA
CDW_SAPP_F_AGG_DATA CDW_SAPP_F_AGG_DATA
CDW_SAPP_F_AGG_DATA CDW_SAPP_F_AGG_DATA
CDW_SAPP_F_AGG_DATA CDW_SAPP_F_AGG_DATA
CDW_SAPP_F_AGG_DATA CDW_SAPP_F_AGG_DATA
CDW_SAPP_F_AGG_DATA CDW_SAPP_F_AGG_DATA
SQL override.
Convert the SALES_F_PERIOD_KEY to date and then fetch one month of data to load the aggregate table
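The extract rule above (convert the period key to a date, then keep one month of data) can be sketched as follows. The worksheet does not give the key format, so the YYYYMMDD layout is an assumption based on the VARCHAR2(8) data type, and the function names are hypothetical.

```python
from datetime import date, datetime


def to_date(period_key: str) -> date:
    """Convert an 8-character period key to a date (assumes YYYYMMDD)."""
    return datetime.strptime(period_key, "%Y%m%d").date()


def rows_for_month(rows: list, year: int, month: int) -> list:
    """Keep only the rows whose period key falls in the reporting month."""
    selected = []
    for row in rows:
        sale_date = to_date(row["SALES_F_PERIOD_KEY"])
        if sale_date.year == year and sale_date.month == month:
            selected.append(row)
    return selected
```

In a SQL override the same filter would normally be pushed into the query itself; the Python form is only meant to make the rule explicit.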
Source extract logic :
Branch code for the current reporting period sold in each branch CDW_SAPP_F_SALES
Branch name for the corresponding branch CDW_SAPP_F_SALES
Product code for the current reporting period sold in each branch CDW_SAPP_F_SALES
Product name for the corresponding product sold CDW_SAPP_F_SALES
Sum of the total quantity of a particular product sold in each branch for the current reporting period. CDW_SAPP_F_SALES
Sum of the price sold of a particular product in each branch for the current reporting period. CDW_SAPP_F_SALES
Sysdate -
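The aggregation described above, summing quantity and amount per branch and product, can be sketched as below. Input field names follow the fact table columns in the worksheet; the output names `TOTAL_QUANTITY` and `TOTAL_AMOUNT` and the function name are assumptions, since the aggregate table's columns are not listed here.

```python
from collections import defaultdict
from decimal import Decimal


def aggregate_sales(rows: list) -> list:
    """Sum quantity and amount for each (branch, product) combination."""
    totals = defaultdict(lambda: [0, Decimal("0")])
    for row in rows:
        key = (
            row["SALES_F_BRANCH_CODE"],
            row["SALES_F_BRANCH_NAME"],
            row["SALES_F_PRODUCT_CODE"],
            row["SALES_F_PRODUCT_NAME"],
        )
        totals[key][0] += row["SALES_SOLD_QTY"]
        totals[key][1] += row["SALES_TOTAL_AMOUNT"]

    result = []
    for (b_code, b_name, p_code, p_name), (qty, amount) in totals.items():
        result.append({
            "SALES_F_BRANCH_CODE": b_code,
            "SALES_F_BRANCH_NAME": b_name,
            "SALES_F_PRODUCT_CODE": p_code,
            "SALES_F_PRODUCT_NAME": p_name,
            "TOTAL_QUANTITY": qty,   # hypothetical output column name
            "TOTAL_AMOUNT": amount,  # hypothetical output column name
        })
    return result
```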
Source File Column Names Source Data Type
- -
SALES_F_BRANCH_CODE NUMBER(9)
SALES_F_BRANCH_NAME VARCHAR2(25)
SALES_F_PRODUCT_CODE NUMBER(10)
SALES_F_PRODUCT_NAME VARCHAR2(30)
SALES_SOLD_QTY NUMBER(7)
SALES_TOTAL_AMOUNT NUMBER(8,2)
- -
Target File name Target Field names Target DataType
MNTH_SALES_RPT_BRANCH_FILE BRANCH_CODE NUMBER(9)
MNTH_SALES_RPT_BRANCH_FILE BRANCH_NAME STRING(25)
MNTH_SALES_RPT_BRANCH_FILE TOTAL_REVENUE NUMBER(8,2)
MNTH_SALES_RPT_BRANCH_FILE CREATED_DATE Date
Date created
Mapping logic Source Table Name
Load the top five product codes that have shown the highest revenue. CDW_SAPP_F_AGG_DATA
Load the product name that has shown the highest revenue. CDW_SAPP_F_AGG_DATA
Cumulative sum of products sold during the reporting period. CDW_SAPP_F_AGG_DATA
Cumulative sum of revenues for all products sold during the reporting period. CDW_SAPP_F_AGG_DATA
Sysdate -
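The top-five product selection above can be sketched as follows. It assumes the aggregate rows carry `PRODUCT_CODE`, `PRODUCT_NAME`, `TOTAL_QUANTITY`, and `TOTAL_AMOUNT` fields (underscored versions of the column names in the worksheet), that revenue is summed across all branches before ranking, and that ties are broken by input order; none of this is stated in the worksheet.

```python
from collections import defaultdict
from decimal import Decimal


def top_products(agg_rows: list, n: int = 5) -> list:
    """Return the n products with the highest cumulative revenue."""
    quantity = defaultdict(int)
    revenue = defaultdict(Decimal)  # Decimal() starts at 0
    for row in agg_rows:
        key = (row["PRODUCT_CODE"], row["PRODUCT_NAME"])
        quantity[key] += row["TOTAL_QUANTITY"]
        revenue[key] += row["TOTAL_AMOUNT"]

    # sorted() is stable, so products with equal revenue keep input order.
    ranked = sorted(revenue, key=lambda k: revenue[k], reverse=True)[:n]
    return [
        {
            "PRODUCT_CODE": code,
            "PRODUCT_NAME": name,
            "TOTAL_QUANTITY": quantity[(code, name)],
            "TOTAL_AMOUNT": revenue[(code, name)],
        }
        for code, name in ranked
    ]
```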
Source Table Column Names Source Data Type
SALES_F_BRANCH_CODE NUMBER(9)
SALES_F_BRANCH_NAME VARCHAR2(25)
SALES_TOTAL_AMOUNT NUMBER(8,2)
- -
PRODUCT_CODE NUMBER(10)
PRODUCT_NAME VARCHAR2(30)
TOTAL QUANTITY NUMBER(7)
TOTAL AMOUNT NUMBER(8,2)
- -