Chapter 2
• A data warehouse targets the modeling and analysis of data for
decision-makers. Therefore, data warehouses typically provide a concise and
straightforward view around a particular subject, such as customer, product,
or sales, rather than the organization's ongoing operations as a whole.
• The data in the data warehouse is not updated by every business
transaction.
Non-volatile
Data Warehouse
Components
Overview of the components
• When we build an operational system, we put several components
together to make up the system.
• You cannot ignore the internal data held in private files.
Reference: https://channels.theinnovationenterprise.com/articles/are-companies-using-their-external-data
External Data
• Data from outside do not conform to your formats.
• Organize data transmission
• Conversion into internal data formats and types.
Data Staging Component
• After extracting data from various sources, you have to prepare the
data for storing in a data warehouse.
• Three major functions need to be performed:
• Extraction
• Transformation
• Loading
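The three staging functions above can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: the source rows, the `sales` table, the MM/DD/YYYY date format, and the function names `extract`, `transform`, and `load` are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw rows, standing in for an extract from an operational system.
SOURCE_ROWS = [
    {"cust": " alice ", "amount": "19.99", "date": "03/01/2024"},
    {"cust": "BOB", "amount": "5.00", "date": "03/02/2024"},
]

def extract():
    """Extraction: pull raw rows from the source (here, an in-memory list)."""
    return list(SOURCE_ROWS)

def transform(rows):
    """Transformation: clean names, convert types, standardize dates to ISO."""
    cleaned = []
    for row in rows:
        # Assumes the source uses MM/DD/YYYY dates.
        month, day, year = row["date"].split("/")
        cleaned.append((
            row["cust"].strip().title(),
            float(row["amount"]),
            f"{year}-{month}-{day}",
        ))
    return cleaned

def load(rows, conn):
    """Loading: write the prepared rows into the (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(transform(extract()), conn)
loaded = conn.execute(
    "SELECT customer, amount, sale_date FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

Keeping the three functions separate mirrors the idea of a staging area: the data is prepared between extraction and loading rather than inside the source system or the warehouse.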
Data Staging Component
• Why do you need a separate place or component to perform data
preparation?
• After you extract the data, you may keep it in a separate physical
environment for further preparation.
Data Transformation
• In every system implementation, data transformation is an important
function.
• For example:
• Novice users
• Casual users
• Business analyst
• Power users etc.
• In a point-of-sale system for a grocery store, the units of sale are captured and
stored at the level of units of a product per transaction at the check-out counter.
• In an order entry system, the quantity ordered is captured and stored at the
level of units of a product per order received from the customer.
• Whenever you need summary data, you add up the individual transactions.
• For example, if you are looking for units of a product ordered this month, you
read all the orders entered for the entire month for that product and add them up.
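Adding up individual transactions to get a monthly figure might look like the sketch below. The order records and the helper `units_ordered` are hypothetical, chosen only to illustrate the idea of summarizing from transaction-level data.

```python
# Hypothetical order-entry records: (order_date, product, units_ordered)
orders = [
    ("2024-03-02", "widget", 10),
    ("2024-03-15", "widget", 5),
    ("2024-03-20", "gadget", 7),
    ("2024-04-01", "widget", 3),
]

def units_ordered(orders, product, month):
    """Add up units for one product over one month (month given as 'YYYY-MM')."""
    return sum(
        units
        for date, prod, units in orders
        if prod == product and date.startswith(month)
    )

march_widgets = units_ordered(orders, "widget", "2024-03")
print(march_widgets)  # 15
```

Every request for a summary rereads all the detailed orders for the period, which is the cost an operational system pays for storing only transaction-level data.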
• Granularity means the level of detail of your data within the data structure. In a
typical data warehouse one might find very detailed data (such as
seconds, a single product, one specific attribute) and aggregated data (such
as totals, monthly orders, all products).
• The finer the granularity of a fact table, the more data you will have. The
granularity of your data also determines what kind of information you can get out of the
stored data.
• Analysis begins at a high level and moves down to lower levels of detail.
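The levels of granularity described above can be illustrated with a small roll-up over detailed facts. This is a sketch with made-up data; `daily_sales` and `roll_up` are hypothetical names, and a real warehouse would normally do this with fact tables and SQL aggregation.

```python
from collections import defaultdict

# Hypothetical facts at the finest grain: one row per product per day.
daily_sales = [
    ("2024-03-01", "widget", 4),
    ("2024-03-01", "gadget", 2),
    ("2024-03-02", "widget", 6),
    ("2024-04-05", "widget", 1),
]

def roll_up(facts, key):
    """Aggregate units to a coarser grain chosen by the key function."""
    totals = defaultdict(int)
    for date, product, units in facts:
        totals[key(date, product)] += units
    return dict(totals)

# High level: units per month across all products.
by_month = roll_up(daily_sales, lambda date, product: date[:7])

# One level down: units per month per product.
by_month_product = roll_up(daily_sales, lambda date, product: (date[:7], product))

# Lowest level: the detailed rows behind the March total.
march_detail = [row for row in daily_sales if row[0].startswith("2024-03")]

print(by_month)
print(by_month_product)
print(march_detail)
```

The coarser levels can always be derived from the detailed rows, but not the other way around, which is why the stored grain limits the questions that can be answered.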
Data Granularity
Exercise
• Identify an organization whose business needs cannot be fulfilled by
existing operational database systems and which requires a data
warehouse solution. List the issues that cannot be resolved
by operational databases for this particular organization and explain how a
data warehouse would help. Also identify the required levels of
granularity (be precise).
Approaches for building a
data warehouse
Data Warehouses & Data Marts
• In 1998, Bill Inmon stated in a magazine article:
• “The single most important issue facing the IT manager this year is
whether to build the data warehouse or the data mart first.”
• Individual data marts are targeted to particular business groups in the enterprise.
• The collection of all the data marts forms an integrated whole, called the enterprise data
warehouse.
• A data mart is a condensed version of a data warehouse and is designed for use by a
specific department, unit, or set of users in an organization, for example, marketing, sales,
HR, or finance.
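One way to picture a data mart as a department-specific subset is the sketch below, which derives a sales mart from a warehouse table. The table names `warehouse_facts` and `sales_mart`, the columns, and the figures are all hypothetical, and an in-memory SQLite database stands in for the warehouse. It shows only the case where the mart is derived from an existing warehouse, not the case where marts are built first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse

# Hypothetical warehouse table holding facts for several departments.
conn.execute(
    "CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "revenue", 1200.0),
        ("sales", "orders", 35.0),
        ("hr", "headcount", 42.0),
        ("marketing", "campaign_spend", 300.0),
    ],
)

# The data mart keeps only what one department needs.
conn.execute(
    "CREATE TABLE sales_mart AS "
    "SELECT metric, value FROM warehouse_facts WHERE department = 'sales'"
)

sales_mart = conn.execute(
    "SELECT metric, value FROM sales_mart ORDER BY metric"
).fetchall()
print(sales_mart)
```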
Basic Data Warehouse
Architecture
[Figure: source OLTP systems feed the enterprise data warehouse (one version of the truth), which feeds subset data marts.]
Differences Between a Data Warehouse and a Data Mart
• In the data mart, the data is usually transformed as part of the load
process (ETL).
[Figure: a single ETL process loads the enterprise data warehouse (EDW); dependent data marts are loaded from the EDW.]
Reasons for creating Data Marts
• The motivations behind the creation of these two types of data marts
are also typically different.