
BI-CH3-DW2-1

course

Uploaded by

Ayman
Copyright
© © All Rights Reserved

Business Intelligence and Analytics: Systems for Decision Support
Global Edition
(10th Edition)

Chapter 3:
Data Warehousing
Main Data Warehousing Topics
• DW definition
• Characteristics of DW
• Data Marts
• ODS, EDW, Metadata
• DW Framework
• DW Architecture & ETL Process
• DW Development
• DW Issues

-2 © Pearson Education Limited 2014


What is a Data Warehouse?
• A physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format
• “The data warehouse is a collection of integrated, subject-oriented databases designed to support DSS functions, where each unit of data is non-volatile …”
A Historical Perspective to Data Warehousing
1970s
• Mainframe computers
• Simple data entry
• Routine reporting
• Primitive database structures
• Teradata incorporated
1980s
• Mini/personal computers (PCs)
• Business applications for PCs
• Distributed DBMS
• Relational DBMS
• Teradata ships commercial DBs
• Business Data Warehouse coined
1990s
• Centralized data storage
• Data warehousing was born
• Inmon, Building the Data Warehouse
• Kimball, The Data Warehouse Toolkit
• EDW architecture design
2000s
• Exponentially growing Web data
• Consolidation of DW/BI industry
• Data warehouse appliances emerged
• Business intelligence popularized
• Data mining and predictive modeling
• Open source software
• SaaS, PaaS, Cloud Computing
2010s
• Big Data analytics
• Social media analytics
• Text and Web Analytics
• Hadoop, MapReduce, NoSQL
• In-memory, in-database


Characteristics of DWs
• Subject oriented
• Integrated
• Time-variant (time series)
• Nonvolatile
• Summarized
• Not normalized
• Metadata
• Web based, relational/multidimensional
• Client/server
Data Mart
A departmental, small-scale “DW” that stores only limited/relevant data
• Dependent data mart
  A subset that is created directly from a data warehouse
• Independent data mart
  A small data warehouse designed for a strategic business unit or a department
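
The dependent case can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (warehouse_sales, marketing_mart), the columns, and the sample rows are hypothetical and do not come from the textbook. The mart is modeled as a departmental subset derived directly from the warehouse table.

```python
import sqlite3

# Hypothetical enterprise warehouse table with a few illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (department TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("Marketing", "East", 120.0),
        ("Marketing", "West", 80.0),
        ("Finance", "East", 300.0),
        ("Engineering", "West", 50.0),
    ],
)

# Dependent data mart: a departmental subset created directly from the warehouse.
conn.execute(
    "CREATE TABLE marketing_mart AS "
    "SELECT region, amount FROM warehouse_sales WHERE department = 'Marketing'"
)

mart_rows = conn.execute(
    "SELECT region, amount FROM marketing_mart ORDER BY region"
).fetchall()
print(mart_rows)
```

An independent data mart would instead be loaded from the source systems without passing through warehouse_sales.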



Other DW Components
• Operational data stores (ODS)
  A type of database often used as an interim (transitional, temporary) area for a data warehouse
• Oper marts: operational data marts
• Enterprise data warehouse (EDW)
  A data warehouse for the enterprise
• Metadata: data about data
  In a data warehouse, metadata describe the contents of a data warehouse and the manner of its acquisition
Application Case 3.1
A Better Data Plan: Well-Established TELCOs Leverage Data Warehousing and Analytics to Stay on Top in a Competitive Industry
Questions for Discussion
1. What are the main challenges for TELCOs?
2. How can data warehousing and data analytics help TELCOs in overcoming their challenges?
3. Why do you think TELCOs are well suited to take full advantage of data analytics?
A Generic DW Framework
[Figure: Data Sources (ERP, Legacy, POS, Other OLTP/Web, External data) feed the ETL Process (Select, Extract, Transform, Integrate, Load), which populates the Enterprise Data Warehouse (with Metadata and Replication). Data then flows to Data Marts (Marketing, Engineering, Finance, ...) or, under the "no data marts" option, directly through API/Middleware to Access Applications (Visualization): Routine Business Reporting, Data/text mining, OLAP, Dashboard, Web, and Custom-built applications.]


Application Case 3.2
Data Warehousing Helps MultiCare Save More Lives
Questions for Discussion
1. What do you think is the role of data warehousing in healthcare systems?
2. How did MultiCare use data warehousing to improve health outcomes?
DW Architecture
• Three-tier architecture
  1. Data acquisition software (back-end)
  2. The data warehouse that contains the data & software
  3. Client (front-end) software that allows users to access and analyze data from the warehouse
• Two-tier architecture
  The first two tiers of the three-tier architecture are combined into one
… sometimes there is only one tier?
DW Architectures
• 3-tier architecture
  Tier 1: Client workstation
  Tier 2: Application server
  Tier 3: Database server
• 2-tier architecture
  Tier 1: Client workstation
  Tier 2: Application & database server
• 1-tier architecture?


Data Warehousing Architectures
• Issues to consider when deciding which architecture to use:
  - Which database management system (DBMS) should be used?
  - Will parallel processing and/or partitioning be used?
  - Will data migration tools be used to load the data warehouse?
  - What tools will be used to support data retrieval and analysis?


A Web-Based DW Architecture
[Figure: a Client (Web browser) connects over the Internet/Intranet/Extranet to a Web Server, which delivers Web pages and works with an Application Server that draws on the Data warehouse.]


Ten factors that potentially affect the architecture selection decision
1. Information interdependence between organizational units
2. Upper management’s information needs
3. Urgency of need for a data warehouse
4. Nature of end-user tasks
5. Constraints on resources
6. Strategic view of the data warehouse prior to implementation
7. Compatibility with existing systems
8. Perceived ability of the in-house IT staff
9. Technical issues
10. Social/political factors
Data Integration and the Extraction, Transformation, and Load Process
[Figure: sources (Packaged application, Legacy system, Other internal applications, Transient data source) pass through the Extract, Transform, Cleanse, and Load steps in sequence, populating the Data warehouse and the Data mart.]
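
The Extract, Transform, Cleanse, and Load steps can be sketched in plain Python. This is a minimal illustration, not a real ETL tool: the two sources, their field names, the cleansing rules, and the list used as the target store are all assumptions made for the example.

```python
# Hypothetical source records; field names and values are illustrative only.
legacy_rows = [
    {"cust": " alice ", "amt": "100.50"},
    {"cust": "BOB", "amt": "n/a"},  # invalid amount, removed by cleansing
]
packaged_rows = [
    {"customer": "Carol", "amount": 75},
    {"customer": "alice", "amount": 100.5},  # duplicate of a legacy record
]


def extract():
    """Pull raw records from each source and tag them with their origin."""
    for row in legacy_rows:
        yield "legacy", row
    for row in packaged_rows:
        yield "packaged", row


def transform(source, row):
    """Map source-specific field names onto one standardized format."""
    if source == "legacy":
        name, amount = row["cust"], row["amt"]
    else:
        name, amount = row["customer"], row["amount"]
    return {"customer": str(name).strip().title(), "amount": amount}


def cleanse(records):
    """Drop records with non-numeric amounts and remove exact duplicates."""
    seen = set()
    for rec in records:
        try:
            amount = float(rec["amount"])
        except (TypeError, ValueError):
            continue
        key = (rec["customer"], amount)
        if key not in seen:
            seen.add(key)
            yield {"customer": rec["customer"], "amount": amount}


def load(records, target):
    """Append cleansed records to the target store (a plain list here)."""
    target.extend(records)
    return target


warehouse = load(cleanse(transform(s, r) for s, r in extract()), [])
print(warehouse)
```

Keeping each step as its own function mirrors the pipeline in the figure, so a step can be swapped out (for example, loading into a database instead of a list) without touching the others.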



Additional DW Considerations
Hosted Data Warehouses
• Benefits:
  - Requires minimal investment in infrastructure
  - Frees up capacity on in-house systems
  - Frees up cash flow
  - Makes powerful solutions affordable
  - Enables solutions that provide for growth
  - Offers better quality equipment and software
  - Provides faster connections
  - … more in the book


Representation of Data in DW
• Dimensional modeling
  - A retrieval-based system that supports high-volume query access
• Star schema
  - The most commonly used and the simplest style of dimensional modeling
  - Contains a fact table surrounded by and connected to several dimension tables
• Snowflake schema
  - An extension of the star schema where the diagram resembles a snowflake in shape
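
A star schema can be illustrated with Python's built-in sqlite3 module. The sketch below is an assumption-laden example, not the textbook's design: the table names (fact_sales, dim_product, dim_region, dim_time), the columns, and the rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables hold descriptive attributes.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_time    (time_id    INTEGER PRIMARY KEY, quarter TEXT);

    -- The central fact table holds the measure plus a key to each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        region_id  INTEGER REFERENCES dim_region(region_id),
        time_id    INTEGER REFERENCES dim_time(time_id),
        units_sold INTEGER
    );

    INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Phone');
    INSERT INTO dim_region  VALUES (1, 'East'), (2, 'West');
    INSERT INTO dim_time    VALUES (1, 'Q1'), (2, 'Q2');
    INSERT INTO fact_sales  VALUES
        (1, 1, 1, 10), (1, 2, 1, 5), (2, 1, 1, 7), (2, 1, 2, 3);
    """
)

# A typical star-join query: total units sold per product.
units_by_product = conn.execute(
    """
    SELECT p.name, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
    """
).fetchall()
print(units_by_product)
```

A snowflake variant would normalize a dimension further, for example by moving a hypothetical product category out of dim_product into its own table that dim_product references.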
Multidimensionality
The ability to organize, present, and analyze data by several dimensions, such as sales by region, by product, by salesperson, and by time (four dimensions)
• Multidimensional presentation
  - Dimensions: products, salespeople, market segments, business units, geographical locations, distribution channels, country, or industry
  - Measures: money, sales volume, head count, inventory profit, actual versus forecast
  - Time: daily, weekly, monthly, quarterly, or …
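
The "sales by region, by product, by salesperson, and by time" idea can be sketched in plain Python. The records and the helper total_by are hypothetical and serve only to show one measure being summarized along any chosen combination of dimensions.

```python
from collections import defaultdict

# Hypothetical sales records: four dimensions and one measure (sales).
records = [
    {"region": "East", "product": "Laptop", "salesperson": "Ann", "time": "Q1", "sales": 100},
    {"region": "East", "product": "Phone", "salesperson": "Ben", "time": "Q1", "sales": 60},
    {"region": "West", "product": "Laptop", "salesperson": "Ann", "time": "Q2", "sales": 80},
    {"region": "West", "product": "Phone", "salesperson": "Cat", "time": "Q2", "sales": 40},
]


def total_by(rows, *dimensions, measure="sales"):
    """Sum the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


sales_by_region = total_by(records, "region")
sales_by_product_time = total_by(records, "product", "time")
print(sales_by_region)
print(sales_by_product_time)
```

The same records answer different questions depending on which dimensions are passed in, which is the point of a multidimensional presentation.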
Analysis of Data in DW
• OLTP vs. OLAP…
• OLTP (online transaction processing)
  - Capturing and storing data from ERP, CRM, POS, …
  - The main focus is on efficiency of routine tasks
• OLAP (online analytical processing)
  - Converting data into information for decision support
  - Data cubes, drill-down / roll-up, slice & dice, …
  - Requesting ad hoc (for a specific purpose) reports
  - Conducting statistical and other analyses
  - Developing multimedia-based applications
OLAP vs. OLTP



OLAP Operations
• Slice: a subset of a multidimensional array
• Dice: a slice on more than two dimensions
• Drill Down/Up: navigating among levels of data ranging from the most summarized (up) to the most detailed (down)
• Roll Up: computing all of the data relationships for one or more dimensions
• Pivot: used to change the dimensional orientation of a report or an …
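
These operations can be sketched on a tiny in-memory cube in plain Python. Everything here is an assumption for illustration: the cube contents, the dimension names, and the helper functions are hypothetical, and roll-up is modeled simply as summing out the dimensions that are not kept.

```python
from collections import defaultdict

DIMS = ("product", "region", "time")

# Hypothetical cube: (product, region, time) -> sales volume.
cube = {
    ("Laptop", "East", "Q1"): 10,
    ("Laptop", "West", "Q1"): 5,
    ("Laptop", "East", "Q2"): 8,
    ("Phone", "East", "Q1"): 7,
    ("Phone", "West", "Q2"): 3,
}


def slice_cube(cells, dim, value):
    """Slice: keep the cells where one dimension equals a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] == value}


def dice_cube(cells, **allowed):
    """Dice: keep cells whose value is in the allowed set for each named dimension."""
    return {
        k: v
        for k, v in cells.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }


def roll_up(cells, *keep):
    """Roll up: aggregate to the kept dimensions, summing out the rest."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for k, v in cells.items():
        totals[tuple(k[i] for i in idx)] += v
    return dict(totals)


def pivot(table):
    """Pivot: swap the two dimensions of a 2-D summary."""
    return {(b, a): v for (a, b), v in table.items()}


q1_slice = slice_cube(cube, "time", "Q1")
laptop_east_dice = dice_cube(
    cube, product={"Laptop"}, region={"East"}, time={"Q1", "Q2"}
)
by_product = roll_up(cube, "product")                   # most summarized (drill up)
by_product_region = roll_up(cube, "product", "region")  # more detailed (drill down)
by_region_product = pivot(by_product_region)
```

Moving from by_product to by_product_region corresponds to drilling down one level, and the reverse direction corresponds to drilling up.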
Slicing Operations on a Simple Three-Dimensional Data Cube
[Figure: a 3-dimensional OLAP cube with slicing operations. The cube's dimensions are Time, Product, and Geography, and its cells are filled with numbers representing sales volumes. Three slices are shown: sales volumes of a specific Product on variable Time and Region; sales volumes of a specific Region on variable Time and Products; sales volumes of a specific Time on variable Region and Products.]


DW Implementation Issues
• Identification of data sources and governance
• Data quality planning, data model design
• ETL tool selection
• Establishment of service-level agreements
• Data transport, data conversion
• Reconciliation process
• End-user support
• Political issues
• … more in the book
Successful DW Implementation
Things to Avoid
• Starting with the wrong sponsorship chain
• Setting expectations that you cannot meet
• Engaging in politically naïve behavior
• Loading the data warehouse with information just because it is available
• Believing that data warehousing database design is the same as transactional database design
• Choosing a data warehouse manager who is technology oriented rather than user oriented
Failure Factors in DW Projects
• Lack of executive sponsorship
• Unclear business objectives
• Cultural issues being ignored
• Change management
• Unrealistic expectations
• Inappropriate architecture
• Low data quality / missing information
• Loading data just because it is available


Massive DW and Scalability
• Scalability
• The main issues pertaining to scalability:
  - The amount of data in the warehouse
  - How quickly the warehouse is expected to grow
  - The number of concurrent users
  - The complexity of user queries
• Good scalability means that the cost of queries and other data-access functions grows linearly with the size of the warehouse
Evolution and Data Warehousing


Traditional versus Active DW



DW Administration and Security
• Data warehouse administrator (DWA)
• DWA should…
  - have knowledge of high-performance software, hardware, and networking technologies
  - possess solid business knowledge and insight
  - be familiar with the decision-making processes so as to suitably design/maintain the data warehouse structure
  - possess excellent communication skills
• Security and privacy are pressing issues in DW
  - Safeguarding the most valuable assets
The Future of DW
• Sourcing…
  - Web, social media, and Big Data
  - Open source software
  - SaaS (software as a service)
  - Cloud computing
• Infrastructure…
  - Columnar
  - Real-time DW
  - Data warehouse appliances
  - Data management practices/technologies
  - In-database & in-memory processing
  - New DBMS
  - Advanced analytics
  - …
