L5 DataWarehousing

KGiSL Institute of Technology is located in Coimbatore, India and is approved by AICTE and affiliated with Anna University. The document discusses the unit on data warehousing from a Foundations of Data Science course. It defines data warehousing, describes its goals of supporting reporting, maintaining historical information, and informing decision making. It also covers characteristics of data warehouses like being subject-oriented, integrated, and time-variant. Finally, it explains the three-tier architecture of data warehousing systems.


KGiSL Institute of Technology

(Approved by AICTE, New Delhi; Affiliated to Anna University, Chennai)


Recognized by UGC, Accredited by NBA (IT)
365, KGiSL Campus, Thudiyalur Road, Saravanampatti, Coimbatore – 641035.

Department of Computer Science and Engineering

Name of the Faculty : Ms. Nithya V

Subject Name & Code : CS3352/ Foundations Of Data Science

Branch & Department : Computer Science and Engineering

Year & Semester : II / III

Academic Year : 2023-24

CS3352/ Foundations Of Data Science

UNIT I
INTRODUCTION
• Data Science: Benefits and uses – facets of data
• Data Science Process: Overview – Defining research goals – Retrieving
data – Data preparation – Exploratory Data analysis – build the model –
presenting findings and building applications
• Data Mining - Data Warehousing – Basic Statistical descriptions of Data
Data Warehousing
• A data warehouse is a way to store historical information from multiple
sources to allow you to analyse and report on related data (e.g., your
sales transaction data, mobile app data and CRM data).
Goals of data warehousing:

1. Support reporting and analysis.

2. Maintain the organization's historical information.

3. Serve as the foundation for decision making.


Characteristics of Data Warehousing
• Subject-oriented: Data is organized around business subjects, such as
customers, products, or orders.
• Integrated: Data from multiple sources is integrated into a single
repository.
• Time-variant: Data is stored over time, so that trends and patterns can
be analyzed.
• Non-volatile: Once loaded, data is not updated or deleted in the data
warehouse, so that historical data is preserved.
• Consistent: Data is held in a consistent form, whichever source it came from.
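The characteristics above can be illustrated with a minimal sketch. It assumes Python's built-in `sqlite3` module as a stand-in for a warehouse; the table name `customer_purchases`, the `load` helper, and all sample rows are hypothetical and invented for illustration.

```python
import sqlite3

# Hypothetical source extracts; names and values are made up for illustration.
sales_rows = [("C001", "2024-01-05", 120.0), ("C002", "2024-01-06", 80.0)]
app_rows = [("c001", "2024-01-07", 15.5)]  # this source exports lowercase IDs

conn = sqlite3.connect(":memory:")
# Subject-oriented: one table per business subject (here, customer purchases).
# Time-variant: every row carries the date of the event it records.
conn.execute(
    "CREATE TABLE customer_purchases ("
    "customer_id TEXT, source TEXT, event_date TEXT, amount REAL)"
)

def load(source, rows):
    """Append rows from one source, normalising IDs to a consistent format."""
    conn.executemany(
        "INSERT INTO customer_purchases VALUES (?, ?, ?, ?)",
        [(cid.upper(), source, day, amount) for cid, day, amount in rows],
    )

# Integrated: both sources land in the same repository.
load("sales", sales_rows)
load("mobile_app", app_rows)

# Non-volatile: a correction is appended as a new row; nothing is updated in place.
load("sales", [("C002", "2024-01-08", -80.0)])

totals = dict(conn.execute(
    "SELECT customer_id, SUM(amount) FROM customer_purchases "
    "GROUP BY customer_id ORDER BY customer_id"
))
row_count = conn.execute("SELECT COUNT(*) FROM customer_purchases").fetchone()[0]
print(totals, row_count)
```

Appending a correcting row instead of issuing an `UPDATE` is one simple way to keep history intact; real warehouses may use other techniques.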
Data warehouse architecture

● Data warehouse architecture is the design of an organization's data storage
framework. It takes information from raw sets of data and stores it in a
structured and easily digestible format.
● A data warehouse system can be constructed in three ways.

a) Single-tier architecture.

b) Two-tier architecture.

c) Three-tier architecture (Multi-tier architecture).



● Single-tier warehouse architecture: It focuses on creating a compact
data set and minimizing the amount of data stored. While it is useful
for removing redundancies, it is not effective for organizations with
large data needs and multiple data streams.
● Two-tier warehouse architecture: It separates the physically available
sources from the warehouse itself. This is most commonly used in small
organizations where a server is used as a data mart. While it is more
effective at storing and sorting data, two-tier is not scalable and
supports only a minimal number of end users.
Three-tier Architecture of Data Warehousing

Three-tier architecture creates a more structured flow for data from raw
sets to actionable insights. It is the most widely used architecture for data
warehouse systems. It is also called multi-tier architecture.

● The bottom tier is the database of the warehouse, where the cleansed
and transformed data is loaded. The bottom tier is a warehouse
database server.
● Metadata is data about data. It is used to describe the data in the data
warehouse, such as its structure, format, and meaning.
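As a rough sketch of the bottom tier, the example below cleanses a few raw records, loads them into a database, and then reads back metadata describing the table. It assumes Python's built-in `sqlite3` module as a stand-in for a warehouse database server; `raw_orders`, `cleanse`, and `fact_orders` are hypothetical names with invented data.

```python
import sqlite3

# Hypothetical raw extract with untidy values (invented for illustration).
raw_orders = [
    {"order_id": "1", "product": " Laptop ", "amount": "1200.50"},
    {"order_id": "2", "product": "phone", "amount": "650"},
    {"order_id": "3", "product": "", "amount": "n/a"},  # unusable record
]

def cleanse(record):
    """Return a typed, tidied tuple, or None if the record cannot be used."""
    product = record["product"].strip().title()
    try:
        amount = float(record["amount"])
    except ValueError:
        return None
    if not product:
        return None
    return int(record["order_id"]), product, amount

# In-memory database standing in for the warehouse database server.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_orders (order_id INTEGER, product TEXT, amount REAL)"
)
clean_rows = [row for row in map(cleanse, raw_orders) if row is not None]
warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", clean_rows)

# Metadata: ask the database to describe the table's structure.
columns = [(name, col_type) for _, name, col_type, *_ in
           warehouse.execute("PRAGMA table_info(fact_orders)")]
print(clean_rows)
print(columns)
```

Here the column names and types returned by `PRAGMA table_info` play the role of structural metadata; a production warehouse would typically also record meaning, lineage, and load times.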
● The middle tier is the application layer, giving an abstracted view of
the database.
● It arranges the data to make it more suitable for analysis. This is done
with an OLAP (Online Analytical Processing) server, implemented
using the ROLAP (Relational OLAP) or MOLAP (Multidimensional OLAP) model.
● OLAP servers can interact with both relational databases and
multidimensional databases, which lets them gather data across a
broader set of parameters.
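The difference between the two OLAP models can be sketched in a few lines. This is an illustrative toy, not how a real OLAP server is built: the ROLAP-style part assumes Python's built-in `sqlite3` module and computes aggregates with SQL on request, while the MOLAP-style part pre-aggregates into an in-memory cube. The `facts` rows and the `"*"` convention for "all values" are assumptions made for this example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sales facts: (region, quarter, revenue).
facts = [
    ("North", "Q1", 100.0), ("North", "Q2", 150.0),
    ("South", "Q1", 90.0), ("South", "Q2", 60.0),
]

# ROLAP-style: facts stay in relational tables; aggregates are computed by SQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?, ?)", facts)
rolap_by_region = dict(db.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region"
))

# MOLAP-style: pre-aggregate into a cube keyed by dimension values,
# with "*" standing for "all values" along a dimension.
cube = defaultdict(float)
for region, quarter, revenue in facts:
    for key in ((region, quarter), (region, "*"), ("*", quarter), ("*", "*")):
        cube[key] += revenue

print(rolap_by_region)
print(cube[("North", "*")], cube[("*", "Q1")], cube[("*", "*")])
```

The trade-off shown is the usual one: the relational approach computes each answer when asked, while the cube spends memory up front so that lookups are immediate.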
● The top tier is the front end of an organization's overall business
intelligence suite.
● The top tier is where the user accesses and interacts with data via
queries, data visualizations and data analytics tools. It
represents the front-end client layer.
● This client level includes the tools and Application
Programming Interfaces (APIs) used for high-level data analysis,
querying and reporting.
● Users can work with reporting, query, analysis or data mining tools.
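A minimal sketch of what a top-tier reporting helper might look like is given below. It assumes Python's built-in `sqlite3` module in place of the lower tiers; `build_demo_warehouse`, `top_products_report`, and the sample data are hypothetical and exist only for this illustration.

```python
import sqlite3

def build_demo_warehouse():
    """Create a tiny in-memory stand-in for the lower tiers (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("Laptop", 1200.0), ("Phone", 650.0), ("Laptop", 800.0), ("Tablet", 300.0)],
    )
    return conn

def top_products_report(conn, limit=2):
    """Front-end helper: run a query and format the result for a reader."""
    rows = conn.execute(
        "SELECT product, SUM(revenue) AS total FROM sales "
        "GROUP BY product ORDER BY total DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [f"{rank}. {product}: {total:.2f}"
            for rank, (product, total) in enumerate(rows, start=1)]

report = top_products_report(build_demo_warehouse())
print("\n".join(report))
```

The user of such a tool sees only the formatted report; the query and the storage behind it stay hidden in the lower tiers.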
Needs of Data Warehouse

• Business user: Business users require a data warehouse to view summarized
data from the past.
• Store historical data: A data warehouse is required to store time-variant
data from the past.
• Make strategic decisions: Some strategies may depend on the data
in the data warehouse, so the data warehouse contributes to making strategic
decisions.
• Quick response time: A data warehouse has to be ready for somewhat
unexpected loads and types of queries, which demands a significant degree
of flexibility and quick response time.
Benefits of Data Warehouse

● Understand business trends and make better forecasting decisions.
● Data warehouses are designed to perform well with enormous amounts of data.
● The structure of data warehouses is easier for end users to
navigate, understand and query.
● Queries that would be complex in many normalized databases can be
easier to build and maintain in data warehouses.
● Data warehousing is an efficient method to manage demand for lots of
information from lots of users.
● Data warehousing provides the capability to analyze a large amount of
historical data.
