BA102 - NOTES M3

BA102 - Business Analytics

by: Angel Dinah G. Bedayos

BA102 – FUNDAMENTALS OF DATA WAREHOUSING MODULE 3


DATA WAREHOUSE: THE CHOICE OF INMON VERSUS KIMBALL
When it comes to data warehouse (DWH) design, two of the most widely discussed approaches are the Inmon and the Kimball methodologies. For years, people have debated which data warehouse approach is better and more effective for businesses. However, there is still no definite answer, as both methods have their benefits and drawbacks.

In this module, we will discuss the basics of a data warehouse and its characteristics, and compare the two popular data warehouse approaches: Kimball vs. Inmon.

The key data warehouse concept allows users to access a unified version of truth for timely business decision-making, reporting, and forecasting. A DWH functions like an information system with all the past and cumulative data stored from one or more sources.

CHARACTERISTICS OF A DATA WAREHOUSE
The following are the four characteristics of a data warehouse:

Subject-Oriented: A data warehouse is organized around a theme and delivers information about a specific subject instead of a company's current operations. In other words, the data warehousing process is better equipped to handle a specific theme. Examples of themes or subjects include sales, distribution, marketing, etc.

Integrated: Integration is defined as establishing a connection between large amounts of data from multiple databases or sources. It is also essential for the data to be stored in the data warehouse in a unified manner. The data warehousing process integrates data from multiple sources, such as a mainframe, relational databases, flat files, etc. Furthermore, it helps maintain consistent codes, attribute measures, naming conventions, and formats.

Time-variant: The time horizon of a data warehouse is more extensive than that of operational systems. Data stored in a data warehouse is tied to a specific time period and provides information from a historical perspective.

Non-volatile: In a non-volatile data warehouse, data is permanent, i.e. when new data is inserted, previous data is not replaced, omitted, or deleted. Data is read-only and is refreshed only at certain intervals. The two data operations performed in the data warehouse are data access and data loading.

FUNCTIONS OF A DATA WAREHOUSE
A data warehouse functions as a repository. It helps organizations avoid the cost of storage systems and backup data at an enterprise level. The prominent functions of the data warehouse are:

Data Cleaning | fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset
Data Integration | combining data from multiple separate business systems into a single unified view, often called a single view of the truth
Data Mapping | connecting a data field from one source to a data field in another source
Data Extraction | obtaining data from multiple sources and moving it to a new destination designed to support online analytical processing
Data Transformation | converting, cleansing, and structuring data into a usable format that can be analyzed to support decision-making processes
Data Loading | copying and loading data or data sets from a source file, folder, or application to a data warehouse or similar application
Refreshing | managing the time differences between updates to the data sources and updates to the related data warehouse objects (base tables, materialized views, data cubes, data marts)

Reporting Needs: KIMBALL or INMON
BILL INMON | RALPH KIMBALL
If you need organization-wide and integrated reporting. | If you require reporting focused on a business process or team.
Project Deadline. Designing a normalized data model is comparatively more complex than designing a denormalized model. This makes the Inmon approach a time-intensive process. | If you have less time for delivery, then opt for the Kimball method.
Prospective Recruitment Plan. The higher complexity of data model creation in the Inmon data warehouse approach requires a larger team of professionals for data warehouse management. Therefore, choose accordingly.
Frequent Changes. If your reporting needs are likely to change more quickly and you are dealing with volatile source systems, then opt for the Inmon method, as it offers more flexibility. | Frequent Changes. However, if reporting needs and source systems are comparatively stable, it is better to use the Kimball method.
Organizational Principles. If your organization's stakeholders and corporate directors recognize the need for data warehousing and are ready to bear the expenses, then the Bill Inmon data warehouse method would be a safer bet. | Organizational Principles. If the decision-makers aren't concerned about the nitty-gritty of the approach and are only looking for a solution to improve reporting, then it is sufficient to opt for the Kimball data warehouse method.

What are they saying?
These two influential data warehousing experts represent the current prevailing views on data warehousing.
• Kimball, in 1997, stated that
   o "...the data warehouse is nothing more than the union of all the data marts",
   o Kimball indicates a bottom-up data warehousing methodology in which individual data marts providing thin views into the organizational data could be created and later combined into a larger all-encompassing data warehouse.
• Inmon responded in 1998 by saying,
   o "You can catch all the minnows in the ocean and stack them together and they still do not make a whale,"
   o This indicates the opposing view that the data warehouse should be designed from the top down to include all corporate data. In this methodology, data marts are created only after the complete data warehouse has been created.

KIMBALL APPROACH
The Kimball approach to designing a data warehouse was introduced by Ralph Kimball. It starts by identifying the business processes and the questions that the data warehouse has to answer. This information is analyzed and then documented. Extract, Transform, Load (ETL) software brings data from multiple data sources into a common area called staging; it is then loaded into data marts and transformed into an OLAP cube.

OLAP (online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.

• Once data is uploaded to the data warehouse staging area, the next phase is loading it into a dimensional data warehouse model that is denormalized by nature. This model partitions data into fact tables, which hold numeric transactional data, and dimension tables, which hold the reference information that supports the facts.
• The star schema is the fundamental element of the dimensional data warehouse model. The combination of a fact table with several dimension tables is often called a star schema. Kimball dimensional modeling allows users to construct several star schemas to fulfill various reporting needs. The advantage of the star schema is that queries against small dimension tables run almost instantaneously.
• The Kimball matrix displays how star schemas are constructed. Business management teams use it as an input to prioritize which row of the Kimball matrix should be implemented first.

Applications:
o Setup and build are quick.
o Generating reports against multiple star schemas is very successful.
o Database operations are very effective.
o It occupies less space in the database, and management is easy.

ADVANTAGES OF THE KIMBALL APPROACH
• Kimball dimensional modeling is fast to construct because no normalization is involved, which means swift execution of the initial phase of the data warehousing design process.
• An advantage of the star schema is that most data operators can easily comprehend it because of its denormalized structure, which simplifies querying and analysis.
• The data warehouse system footprint is small because it focuses on individual business areas and processes rather than the whole enterprise. So, it takes less space in the database, simplifying system management.
• It enables fast data retrieval from the data warehouse, as data is segregated into fact tables and dimensions. For example, the fact and dimension tables for the insurance industry would include policy transactions and claims transactions.
• A smaller team of designers and planners is sufficient for data warehouse management because data source systems are stable and the data warehouse is process-oriented. Also, query optimization is straightforward, predictable, and controllable.
• The Kimball approach to the data warehouse lifecycle is also referred to as the business dimensional lifecycle approach because it allows business intelligence tools to go deeper across several star schemas and generate reliable insights.
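
To make the staging, fact table, and dimension table ideas in this section more concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It assumes a small insurance example: the table names (dim_customer, dim_product, dim_date, fact_policy_transaction), the column names, and the sample rows are all hypothetical and chosen only for illustration; a real Kimball design would be driven by the documented business process.

```python
import sqlite3

# Hypothetical staged rows (illustrative data, not taken from the notes).
STAGED_ROWS = [
    {"policy_no": "P-001", "customer": " alice ", "product": "Auto", "date": "2024-01-15", "premium": "120.50"},
    {"policy_no": "P-002", "customer": "Bob", "product": "Home", "date": "2024-01-20", "premium": "300.00"},
    {"policy_no": "P-003", "customer": "Alice", "product": "Home", "date": "2024-02-03", "premium": "280.00"},
    {"policy_no": "P-004", "customer": "Bob", "product": "Auto", "date": "2024-02-10", "premium": None},
]


def build_star_schema(conn):
    """Create one fact table surrounded by three dimension tables."""
    conn.executescript(
        """
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT UNIQUE);
        CREATE TABLE fact_policy_transaction (
            policy_no TEXT,
            customer_key INTEGER REFERENCES dim_customer (customer_key),
            product_key INTEGER REFERENCES dim_product (product_key),
            date_key INTEGER REFERENCES dim_date (date_key),
            premium REAL
        );
        """
    )


def lookup_key(conn, table, column, value):
    """Return the surrogate key for a dimension value, adding the row if it is new."""
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = conn.execute(f"SELECT rowid FROM {table} WHERE {column} = ?", (value,)).fetchone()
    return row[0]


def load_staged_rows(conn, rows):
    """Clean staged rows and load them into the star schema; return the number loaded."""
    loaded = 0
    for row in rows:
        if row["premium"] is None:  # data cleaning: skip incomplete records
            continue
        customer = row["customer"].strip().title()  # data cleaning: fix formatting
        conn.execute(
            "INSERT INTO fact_policy_transaction VALUES (?, ?, ?, ?, ?)",
            (
                row["policy_no"],
                lookup_key(conn, "dim_customer", "name", customer),
                lookup_key(conn, "dim_product", "name", row["product"]),
                lookup_key(conn, "dim_date", "iso_date", row["date"]),
                float(row["premium"]),  # transformation: text to number
            ),
        )
        loaded += 1
    conn.commit()
    return loaded


def premium_by_product(conn):
    """Aggregate the numeric fact by one dimension attribute."""
    return conn.execute(
        """
        SELECT p.name, SUM(f.premium)
        FROM fact_policy_transaction AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.name
        ORDER BY p.name
        """
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_star_schema(conn)
    print(load_staged_rows(conn, STAGED_ROWS), "rows loaded")
    print(premium_by_product(conn))
```

The fact table holds only keys and the numeric measure, while descriptive values live once in each dimension table, so a report is a join from the fact table out to whichever dimensions it needs. Adding another star schema (for claims transactions, say) could reuse the same dimension tables.
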

DISADVANTAGES OF THE KIMBALL APPROACH


× Data isn't entirely integrated before reporting; the idea of a 'single source of truth' is lost.
× Irregularities can occur when data is updated in the Kimball DW architecture. This is because denormalization adds redundant data to database tables.
× In the Kimball DW architecture, performance issues may occur due to the addition of columns to the fact table, as these tables are quite in-depth. Adding new columns can expand the fact table's dimensions, affecting its performance. Also, the dimensional data warehouse model becomes difficult to alter with any change in the business needs.
× As the Kimball model is business process-oriented instead of focusing on the enterprise as a whole, it cannot handle all the BI reporting requirements.
× The process of incorporating large amounts of legacy data into the data warehouse is complex.

INMON APPROACH
Bill Inmon, the father of data warehousing, came up with the concept of developing a data warehouse that identifies the main subject areas and entities the enterprise works with, such as customers, products, vendors, and so on. Bill Inmon's definition of a data warehouse is that it is a "subject-oriented, nonvolatile, integrated, time-variant collection of data in support of management's decisions."

This approach starts with a corporate data model. This model identifies the key areas and also covers customers, products, and vendors. It serves as the basis for a detailed logical model, which is used for major operations. The details and models are then used to develop a physical model.
• This model is normalized, which reduces data redundancy. It is a complex model that is difficult to use directly for business purposes, so data marts are created from it, and each department can use its data mart for its own purposes.

Applications:
o The data warehouse is very flexible to changes.
o Business processes can be understood very easily.
o Reports can be handled across the enterprise.
o The ETL process is much less prone to errors.

ADVANTAGES OF THE INMON METHOD
• The data warehouse acts as a unified source of truth for the entire business, where all data is integrated.
• This approach has very low data redundancy. So, there is less possibility of data update irregularities, making the ETL-based data warehouse process more straightforward and less susceptible to failure.
• Since the corporate data model serves as the starting point for establishing a data warehouse, each important entity is given a thorough logical model in this approach.
• This approach offers greater flexibility, as it is easier to update the data warehouse when the business requirements or source data change.
• It can handle diverse enterprise-wide reporting requirements.

DISADVANTAGES OF THE INMON METHOD
× Complexity increases as multiple tables are added to the data model over time.
× Resources skilled in data warehouse data modeling are required, which can be expensive and challenging to find.
× The preliminary setup and delivery are time-consuming.

PARAMETERS | KIMBALL – Dimensional Design | INMON – Enterprise Warehouse (CIF)
 | Kimball publishes “The DWH Toolkit” | Inmon publishes “Building the DWH”
Approach | It has a bottom-up approach for implementation (updates book and defines multiple databases called data marts that are organized by business processes, but use an enterprise standard data bus) | It has a top-down approach for implementation (updates book and defines architecture for collection of disparate sources into a detailed, time-variant data store)
Data Integration | It focuses on individual business areas | It focuses on enterprise-wide areas
Data Model | It prefers data to be in the denormalized model | It prefers data to be in the normalized model
Data Store Systems | Source systems are highly stable | Source systems have a high rate of change
Dimensional Models | Star and snowflake models | ER modeling techniques
Development Methodology | Spiral approach | Centralized approach
Building Time | It is efficient and takes less time | It is complex and consumes a lot of time
Cost | It has iterative steps and is cost-effective | Initial cost is huge and the development cost is low
Skills Required | It does not need specialized skills; a generic team will do the job | It needs specialized skills to make it work
Maintenance | Maintenance is difficult | Maintenance is easy
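
The "Data Model" row above, together with the points on update irregularities and low redundancy, can be illustrated with a small sketch. It is only an illustration under assumed data: the customer "Acme", the cities, and the field names are hypothetical. It shows what can happen when an update reaches only one copy of a repeated value in a denormalized table, compared with a normalized layout where the value is stored once.

```python
def denormalized_update():
    """Update one row of a denormalized table and report the cities now on file."""
    # The customer's city is repeated on every order row (redundant data).
    orders = [
        {"order_id": 1, "customer": "Acme", "city": "Manila", "amount": 100},
        {"order_id": 2, "customer": "Acme", "city": "Manila", "amount": 250},
    ]
    # An update that reaches only the first row leaves two conflicting values.
    orders[0]["city"] = "Cebu"
    return sorted({o["city"] for o in orders if o["customer"] == "Acme"})


def normalized_update():
    """Update the single customer record and report the cities now on file."""
    # The city is stored once; orders refer to the customer by key.
    customers = {"C1": {"name": "Acme", "city": "Manila"}}
    orders = [
        {"order_id": 1, "customer_id": "C1", "amount": 100},
        {"order_id": 2, "customer_id": "C1", "amount": 250},
    ]
    # One change is seen by every order that references the customer.
    customers["C1"]["city"] = "Cebu"
    return sorted({customers[o["customer_id"]]["city"] for o in orders})


if __name__ == "__main__":
    print("denormalized:", denormalized_update())
    print("normalized:", normalized_update())
```

The denormalized table is simpler to read and query, which is the trade-off the Kimball approach accepts; the normalized layout costs an extra lookup per order but keeps each fact in one place, which is what the Inmon approach relies on.
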
CASE EXAMPLES (Inmon or Kimball Approach?)
Delivery Time
• When you develop a data warehouse, it takes some time to make the first delivery to the customer or the organization. If your company can afford to wait a long time for the data warehouse, you can choose the Inmon approach. It will take around four to nine months.
• But if you have a very limited time frame for the delivery, you can follow the Kimball approach. The main reason for the time difference is that a normalized data model is more difficult to create than a denormalized model.

Workforce
• If your company has only a small workforce available to maintain the data warehouse, Kimball's approach can be used to develop the data warehouse.
• But if you can afford a large staff, then the Inmon approach is more useful.

Requirements for the reporting
• The Inmon method is more suitable for large enterprises with integrated reporting, since the requirements are strategic.
• The Kimball technique is ideal for reporting based on a team or business process.

Adapting to changes
• The Inmon approach performs better, because it is more adaptable, if the reporting requirements are anticipated to change more quickly and the source systems are known to be volatile.
• The Kimball technique can be used if the reporting requirements and source systems are reasonably stable.

Choices with company culture
• The Inmon strategy is preferable if the sponsors of the data warehouse and the firm's managers are aware of its value proposition and ready to accept long-term benefits from the data warehouse investment.
• The Kimball technique is sufficient if the sponsors don't care about the concepts and only want a way to improve reporting.
