Best Practices in Data Warehouse Testing
Agenda
Introduction to Data Warehouse
Why best practices are required in DWH testing
Phases in data warehouse testing
Data warehouse testing goals
Types of data warehouse testing
Generic challenges
www.Test2008.in 2
Data Mocking
[Flowchart. Nodes: Production, Source File, Test Data, Sampling of Data, Correct? (YES/NO), Analysis, Defect, BA Sign Off, Production Checkout.]
[Flowchart. Decision nodes: Is DQ % > Emergency %? Is DQ % > Threshold %? Outcome node: DQ e-mail.]
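The two DQ decisions could be sketched roughly as follows. This is only an illustration: the slide does not say how DQ % is computed or what each branch triggers, so the failure-rate definition, the function name `dq_alert_level`, and the three return labels are all assumptions.

```python
def dq_alert_level(failed_rows, total_rows, threshold_pct, emergency_pct):
    """Classify a load by its data-quality (DQ) failure percentage.

    Assumption: DQ % is the share of rows that failed a quality check,
    and Emergency % is a stricter (higher) limit than Threshold %.
    """
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    dq_pct = 100.0 * failed_rows / total_rows
    # Check the emergency limit first, then the ordinary threshold.
    if dq_pct > emergency_pct:
        return "emergency"   # hypothetical label: escalate immediately
    if dq_pct > threshold_pct:
        return "threshold"   # hypothetical label: send the DQ e-mail
    return "ok"


# Example with made-up limits: 5 % threshold, 20 % emergency.
level = dq_alert_level(failed_rows=10, total_rows=100,
                       threshold_pct=5, emergency_pct=20)
```

Both comparisons are strict, matching the ">" shown on the slide, so a DQ % exactly equal to a limit does not trigger it.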
Condition 1
Mocked data for the existing-row update scenario (natural key match condition): change any non-natural-key attribute in existing data.
OUTPUT: Existing record should get updated with the new changes. Update Date = job run date.
Condition 2
Mocked data for the new-row insert scenario (natural key not match condition): change any natural key attribute in existing data.
OUTPUT: New record should be inserted in the table with the changed natural key combination. Load dt, Update Date = job run date.
Condition 3
Mocked data for do-not-delete scenarios (natural key and all other non-natural-key attributes match condition): create a new record identical to an existing record.
Defects Found: raise an MQC ticket and communicate the issue to the Version DBA on the open-line conference.
BA Testing Sign-off
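The mocked-data conditions can be exercised against a small model of a natural-key load. The sketch below is not the actual ETL job: the in-memory `table`, the function `load_row`, and the column names are hypothetical, and the behaviour for the all-attributes-match case (leave the row untouched) is an assumption, since the slide gives no expected output for it.

```python
from datetime import date


def load_row(table, row, natural_key, run_date):
    """Apply one incoming row to `table`, a dict keyed by natural-key tuple.

    Returns "insert", "update", or "no_change".
    """
    key = tuple(row[k] for k in natural_key)
    existing = table.get(key)
    if existing is None:
        # Natural key not matched: insert, stamping load and update dates.
        table[key] = dict(row, load_dt=run_date, update_dt=run_date)
        return "insert"
    if any(existing[col] != val for col, val in row.items()):
        # Natural key matched, some other attribute changed: update in place.
        existing.update(row)
        existing["update_dt"] = run_date
        return "update"
    # Natural key and all other attributes match: assumed to be left as-is.
    return "no_change"


table = {}
nk = ["cust_id", "region"]          # hypothetical natural key columns
day1, day2 = date(2024, 1, 1), date(2024, 1, 2)
load_row(table, {"cust_id": 1, "region": "EU", "name": "A"}, nk, day1)

r_update = load_row(table, {"cust_id": 1, "region": "EU", "name": "B"}, nk, day2)
r_insert = load_row(table, {"cust_id": 1, "region": "US", "name": "B"}, nk, day2)
r_same = load_row(table, {"cust_id": 1, "region": "EU", "name": "B"}, nk, day2)
```

A test for each condition then reduces to asserting the returned action and the date columns of the affected row.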
(This involves adding columns to physical tables, and dropping and recreating view objects.)
(This involves renaming physical objects, and dropping and recreating view objects.)
Physical Tables
Run DESCRIBE on the physical tables to verify added columns and renames.
Views
Select 1-5 rows (pre- and post-version predictions) to verify that columns were added with the appropriate default values and that the expected changes were applied.
Defects Found: raise an MQC ticket and communicate the issue to the Version DBA on the open-line conference.
No Defects: BA Testing Sign-off
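The describe-and-sample check might look like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in for the warehouse database, with `PRAGMA table_info` playing the role of DESCRIBE. The table `prediction`, its columns, and the added `status` column are invented for illustration.

```python
import sqlite3


def describe(conn, table):
    """Return the column names of `table` (SQLite analogue of DESCRIBE)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [r[1] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prediction (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO prediction (id, amount) VALUES (1, 10.0)")
before = describe(conn, "prediction")

# Hypothetical version change: add a column with a default value.
conn.execute("ALTER TABLE prediction ADD COLUMN status TEXT DEFAULT 'NEW'")
after = describe(conn, "prediction")

# Schema check: which columns were added by the version change?
added = [c for c in after if c not in before]

# Data check: sample up to 5 rows and confirm the default was applied.
sample = conn.execute(
    "SELECT id, amount, status FROM prediction LIMIT 5"
).fetchall()
conn.close()
```

Capturing the column list before and after the change lets the comparison be asserted rather than eyeballed. The small row sample confirms that pre-existing rows picked up the default.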
Generic Challenges
Huge data volume and complexity.
Data volume and complexity impact performance and productivity.
The scope of testing is broad, as it also involves regression testing of data.
Different Change Requests have different testing flows and associated rules.
The data used for predictions differs from the actual data and has to be mapped to it based on different business rules.
A given test scenario is rarely repeated.
Currently all types of testing, including regression, are done manually.
Code Implementation
Data Comparison
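The content of this slide did not survive extraction. As a rough illustration of what an automated data comparison can look like, the sketch below compares source and target rows by key. It is not the presenters' implementation, and `compare_rows`, the row layout, and the result labels are assumptions.

```python
def compare_rows(source, target, key):
    """Compare two lists of dict rows, matched on the `key` columns."""
    src = {tuple(r[k] for k in key): r for r in source}
    tgt = {tuple(r[k] for k in key): r for r in target}
    return {
        # Keys present in the source but never loaded into the target.
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        # Keys in the target with no corresponding source row.
        "extra_in_target": sorted(tgt.keys() - src.keys()),
        # Keys present on both sides whose attribute values differ.
        "mismatched": sorted(
            k for k in src.keys() & tgt.keys() if src[k] != tgt[k]
        ),
    }


source = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}, {"id": 3, "amt": 30}]
target = [{"id": 1, "amt": 10}, {"id": 2, "amt": 25}, {"id": 4, "amt": 40}]
report = compare_rows(source, target, key=["id"])
```

A report of this shape can be rerun after every load, which is one way to take regression comparisons out of manual testing.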
Any Questions?