L5 Data Warehousing
CS8691/AI/IIICSE/VISEM/KG-KiTE
CS3352/ Foundations Of Data Science
UNIT I
INTRODUCTION
• Data Science: Benefits and uses – facets of data
• Data Science Process: Overview – Defining research goals – Retrieving
data – Data preparation – Exploratory data analysis – Build the model –
Presenting findings and building applications
• Data Mining – Data Warehousing – Basic statistical descriptions of data
Data Warehousing
• A data warehouse is a way to store historical information from multiple
sources to allow you to analyse and report on related data (e.g., your
sales transaction data, mobile app data and CRM data).
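The idea of combining several sources into one reportable store can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three in-memory sources, their field names, and the helper build_customer_history are hypothetical and do not come from any particular warehouse product.

```python
from collections import defaultdict

# Hypothetical extracts from three source systems, all keyed by customer id.
sales = [
    {"customer_id": 1, "date": "2023-01-05", "amount": 120.0},
    {"customer_id": 2, "date": "2023-01-06", "amount": 80.0},
    {"customer_id": 1, "date": "2023-02-01", "amount": 40.0},
]
app_events = [
    {"customer_id": 1, "date": "2023-01-04", "event": "login"},
    {"customer_id": 2, "date": "2023-01-06", "event": "login"},
]
crm = {
    1: {"name": "Asha", "segment": "retail"},
    2: {"name": "Ravi", "segment": "wholesale"},
}


def build_customer_history(sales, app_events, crm):
    """Merge the three sources into one record per customer."""
    history = defaultdict(lambda: {"total_sales": 0.0, "app_events": 0})
    for row in sales:
        history[row["customer_id"]]["total_sales"] += row["amount"]
    for row in app_events:
        history[row["customer_id"]]["app_events"] += 1
    # Attach the CRM attributes so related data can be reported together.
    return {cid: {**crm.get(cid, {}), **facts} for cid, facts in history.items()}


warehouse = build_customer_history(sales, app_events, crm)
print(warehouse[1])
```

A real warehouse would persist this history in a database rather than a dictionary; the point here is only that related data from separate systems ends up in one place for analysis.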
Goals of data warehousing:
Types of data warehouse architecture:
a) Single-tier architecture.
b) Two-tier architecture.
c) Three-tier architecture.
The three-tier architecture creates a more structured flow for data, from raw
data sets to actionable insights. It is the most widely used architecture for
data warehouse systems and is also called multi-tier architecture.
● The bottom tier is the warehouse database server, where the cleansed
and transformed data is loaded.
● Metadata is data about data. It is used to describe the data in the data
warehouse, such as its structure, format, and meaning.
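As a rough sketch of the bottom tier, the example below cleanses a few raw rows, loads them into a database, and then reads back metadata describing the table. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a warehouse server; the table name sales, its columns, and the cleanse helper are assumptions made for illustration.

```python
import sqlite3

# Hypothetical raw rows from a source system: stray spaces, inconsistent
# capitalisation, and one row with a missing amount.
raw_rows = [
    (" 2023-01-05 ", "chennai", "120.50"),
    ("2023-01-06", "Madurai ", "80"),
    ("2023-01-06", "Chennai", None),
]


def cleanse(row):
    """Return a cleaned (date, city, amount) tuple, or None to reject the row."""
    date, city, amount = row
    if amount is None:
        return None
    return (date.strip(), city.strip().title(), float(amount))


# An in-memory SQLite database stands in for the warehouse database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, city TEXT, amount REAL)")

clean_rows = [r for r in map(cleanse, raw_rows) if r is not None]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean_rows)

# Metadata (data about data): PRAGMA table_info lists each column's
# name and declared type for the given table.
metadata = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(sales)")]
loaded = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]

print(metadata)
print(loaded)
```

Here the metadata covers only structure (column names and types). Descriptions of format and meaning would normally be kept in a separate catalogue.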
Characteristics of Data Warehousing
● The middle tier is the application layer, giving an abstracted view of
the database.
● It arranges the data to make it more suitable for analysis. This is done
with an OLAP (Online Analytical Processing) server, implemented
using the ROLAP (Relational OLAP) or MOLAP (Multidimensional OLAP) model.
● OLAP servers can interact with both relational databases and
multidimensional databases, which lets them gather data across a
broader range of parameters.
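The difference between the two OLAP models can be illustrated with a small roll-up of sales by year. This is a simplified sketch, not how a production OLAP server is built: the ROLAP-style part aggregates a relational table with SQL on demand, while the MOLAP-style part precomputes cells of a small cube held in a dictionary. The fact rows and the "ALL" marker for rolled-up dimensions are assumptions made for the example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, city, amount).
facts = [
    ("2023", "Q1", "Chennai", 100.0),
    ("2023", "Q1", "Madurai", 50.0),
    ("2023", "Q2", "Chennai", 70.0),
    ("2024", "Q1", "Chennai", 30.0),
]

# ROLAP-style: keep the facts in a relational table and aggregate on demand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year TEXT, quarter TEXT, city TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", facts)
rolap_by_year = dict(
    conn.execute("SELECT year, SUM(amount) FROM sales GROUP BY year")
)

# MOLAP-style: precompute cube cells so a roll-up becomes a simple lookup.
cube = defaultdict(float)
for year, quarter, city, amount in facts:
    cube[(year, quarter, city)] += amount   # base cell
    cube[(year, "ALL", "ALL")] += amount    # cell rolled up to the year level

molap_by_year = {
    year: total
    for (year, quarter, city), total in cube.items()
    if quarter == "ALL" and city == "ALL"
}

print(rolap_by_year)
print(molap_by_year)
```

Both approaches give the same totals. The trade-off is that the relational version computes them at query time, whereas the cube pays the cost up front when it is built.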
● The top tier is the front-end client layer of an organization's overall
business intelligence suite.
● It is where the user accesses and interacts with data via queries,
data visualizations and data analytics tools.
● This client level includes the tools and Application Programming
Interfaces (APIs) used for high-level data analysis, querying and
reporting.
● Users can work with reporting, query, analysis or data mining tools.
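A top-tier tool can be pictured as a thin client that sends a query to the warehouse and formats the answer for the user. The sketch below assumes a hypothetical sales table in an in-memory SQLite database and a made-up helper named city_report; real front ends are usually dashboards or BI tools rather than a single function.

```python
import sqlite3

# A tiny stand-in warehouse with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Chennai", 100.0), ("Madurai", 50.0), ("Chennai", 70.0)],
)


def city_report(conn, min_total=0.0):
    """Return one formatted line per city whose total sales reach min_total."""
    rows = conn.execute(
        "SELECT city, SUM(amount) FROM sales "
        "GROUP BY city HAVING SUM(amount) >= ? "
        "ORDER BY SUM(amount) DESC",
        (min_total,),  # parameter supplied by the user of the tool
    ).fetchall()
    return [f"{city}: {total:.2f}" for city, total in rows]


report = city_report(conn)
filtered = city_report(conn, min_total=100.0)
print("\n".join(report))
```

The user only chooses the parameter (here, a minimum total); the query itself and the storage details stay hidden in the lower tiers.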
Needs of Data Warehouse