Lecture 3
(IS 422)
Main Data warehouse types
• A DWH is a technique for collecting and managing data from varied sources
to provide meaningful business insights. It is a blend of technologies and
components that aids the strategic use of data. (Bill Inmon)
• A dependent DM ensures that the end user is viewing the same version of
the data that is accessed by all other data warehouse users. The high cost of
data warehouses limits their use to large companies.
Data Mart
• Many firms use a lower-cost, scaled-down version of a data warehouse
referred to as an independent DM.
– The first type of column in a fact table, the key column, consists of a
group of foreign keys (FKs) that point to the primary keys of the dimension
tables associated with this fact table to enable business analysis.
– The relationships between the dimensions and the fact table are one-to-many:
each dimension row can be referenced by many fact rows.
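The key-column idea can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the table names (dim_product, dim_customer, fact_sales), the column names, and the sample rows are assumptions made for the example, not part of the lecture material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: two dimensions and one fact table whose key
# columns are foreign keys pointing at the dimensions' primary keys.
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL
);
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE fact_sales (
    product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    sales_amount REAL NOT NULL
);
""")

conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Laptop"), (2, "Mouse")])
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", [(10, "Alice")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 10, 900.0), (1, 10, 950.0), (2, 10, 20.0)])

# One-to-many: a single dimension row is referenced by many fact rows.
rows_per_product = conn.execute("""
    SELECT p.product_name, COUNT(*)
    FROM dim_product p
    JOIN fact_sales f ON f.product_key = p.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

# A fact row whose key has no matching dimension row is rejected.
try:
    conn.execute("INSERT INTO fact_sales VALUES (99, 10, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows_per_product, orphan_rejected)
```

Here the "Laptop" dimension row is shared by two fact rows, while the attempted fact row with product_key 99 fails because no dimension row carries that key.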
Dimensional modeling
• Facts
– The second type of column in a fact table is the actual measures of the
business activity such as the sales revenue and order quantity.
– Every measurement has a grain, which is the level of detail in the measurement
of an event such as a unit of measure, currency used, or ending daily balance of
an account.
– For example, SalesQuantity, SalesAmount, ReturnAmount, ReturnQuantity,
DiscountAmount, DiscountQuantity, and TotalCost are measures that apply to a
customer for a product purchased at a specific time.
– All of these measures are related to the business event (the sale) that the fact
represents and they have a level of granularity related to that event.
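To make the idea of measures and grain concrete, here is a minimal Python sketch. It assumes a grain of one row per customer, product, and date; the class name SalesFact, the field names, and the sample values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesFact:
    # Keys that define the assumed grain: one row per customer,
    # product, and date.
    customer_key: int
    product_key: int
    date_key: str
    # Measures recorded at that grain.
    sales_quantity: int
    sales_amount: float
    discount_amount: float


# Hypothetical sample rows.
facts = [
    SalesFact(10, 1, "2024-01-05", 2, 1800.0, 100.0),
    SalesFact(10, 2, "2024-01-05", 1, 20.0, 0.0),
    SalesFact(11, 1, "2024-01-06", 1, 900.0, 0.0),
]


def total_by_product(rows):
    """Roll the sales_amount measure up from the row grain to product level."""
    totals = defaultdict(float)
    for row in rows:
        totals[row.product_key] += row.sales_amount
    return dict(totals)


product_totals = total_by_product(facts)
print(product_totals)
```

Because every measure is stored at the same grain, it can be summed up to a coarser level (here, per product) without ambiguity about what each number represents.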
You choose which schema to use when building the dimensional model by
considering these questions:
1. What kind of analysis are you trying to perform on that data and how complex is
it?
2. What are the analytical requirements and restrictions?
3. How consistent is the data you want to query and analyze?
4. What BI tool do you plan to use? Although different tools may appear to show
the same type of data, results, and graphs, they can be very different under the
covers, and rely on a specific schema for the best results.
Note: most slides in this file are drawn from: Sharda, Ramesh, Dursun Delen, and Efraim
Turban. Business Intelligence, Analytics, and Data Science: A Managerial Perspective. Pearson,
Schema Types
Star schema
• The star schema (sometimes referred to as the star join schema) is the most
commonly used and the simplest style of dimensional modeling.
• A star schema contains a central fact table surrounded by and connected to
several dimension tables.
• The fact table contains a large number of rows that correspond to observed
facts and external links (i.e., foreign keys).
• A fact table contains the descriptive attributes needed to perform decision
analysis and query reporting, and foreign keys are used to link to
dimension tables.
• The decision analysis attributes consist of performance measures,
operational metrics, aggregated measures (e.g., sales volumes, customer
retention rates, profit margins, production costs, scrap rate), and all the
other metrics needed to analyze the organization’s performance.
• In other words, the fact table primarily addresses what the data warehouse
supports for decision analysis.
• Surrounding the central fact tables (and linked via foreign keys) are
dimension tables.
• The dimension tables contain classification and aggregation information
about the central fact rows.
• Dimension tables contain attributes that describe the data contained within
the fact table; they address how data will be analyzed and summarized.
• In querying, the dimensions are used to slice and dice the numerical values
in the fact table to address the requirements of an ad hoc information need.
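The slice-and-dice idea can be illustrated with a small star schema built in Python's sqlite3 module. This is a sketch under stated assumptions: the tables dim_date, dim_product, and fact_sales, their columns, and the sample figures are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one central fact table and two dimensions.
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, category TEXT
);
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL
);
INSERT INTO dim_date VALUES (1, 2023, 'Q4'), (2, 2024, 'Q1'), (3, 2024, 'Q2');
INSERT INTO dim_product VALUES (1, 'Hardware'), (2, 'Software');
INSERT INTO fact_sales VALUES
    (1, 1, 100.0), (2, 1, 200.0), (2, 2, 50.0), (3, 1, 300.0), (3, 2, 75.0);
""")

# Slice: fix a single value on one dimension (year = 2024) and total
# the numerical measure in the fact table.
slice_2024 = conn.execute("""
    SELECT SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    WHERE d.year = 2024
""").fetchone()[0]

# Dice: restrict two dimensions at once (year and product category)
# and summarize the measure by a dimension attribute (quarter).
dice_hardware_2024 = conn.execute("""
    SELECT d.quarter, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    WHERE d.year = 2024 AND p.category = 'Hardware'
    GROUP BY d.quarter
    ORDER BY d.quarter
""").fetchall()

print(slice_2024, dice_hardware_2024)
```

In both queries the dimension attributes appear only in the WHERE and GROUP BY clauses, while the number being summarized always comes from the fact table.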
Star Schema Main Characteristics
1. Simplicity: It is the simplest type of DWH schema.
non-volatile
updated