
Overview of Data Warehousing

A data warehouse centralizes and consolidates large amounts of data from various sources to support business intelligence activities, enabling better decision-making and improved data quality. It consists of components like external sources, staging areas, data warehouses, data marts, and data mining, and can be constructed using top-down or bottom-up approaches. The architecture typically follows a three-tier model, comprising data sources, an OLAP engine, and front-end tools, facilitating efficient data handling and analysis.

Uploaded by

sivanisri1319
Copyright
© © All Rights Reserved

2. Overview of Data Warehousing
2.1. Define data warehousing
A data warehouse is a system that centralizes and consolidates large
amounts of data from various sources to support business intelligence (BI)
activities, especially analytics and reporting. It acts as a single source of
truth for an organization, holding both current and historical data, and is
designed to facilitate complex queries and analysis.
2.2. State the importance of data warehousing
Data warehousing is crucial for businesses as it enables them to consolidate,
analyze, and utilize vast amounts of data from various sources to make informed decisions and
improve overall business performance.

 Centralized Data Storage:


A data warehouse stores all the important data from different sources in one place. This
makes data easy to access and manage.

 Better Decision Making:


It helps managers and analysts make better decisions by providing accurate and organized
data.

 Improves Data Quality and Consistency:


It removes duplicate and incorrect data, making the information more reliable.

 Faster Query Performance:


Data warehouses are designed for quick searching and reporting, so users get faster results.

 Historical Data Analysis:


It stores past data, which helps in comparing current and previous performance over time.

 Supports Business Intelligence Tools:


Data warehouses work with BI tools to generate charts, graphs, and reports for business
analysis.

 Time-saving for Reporting:


Users don’t need to collect data manually from different places; everything is already
available in the warehouse.

 Helps in Trend Analysis and Forecasting:


Businesses can identify trends and predict future outcomes by analyzing stored data.
2.3. Differences between database and data warehouse

Feature | Database | Data Warehouse
Purpose | Stores current data for day-to-day operations | Stores historical data for analysis and reporting
Data Type | Transactional data (e.g., sales, payments) | Analytical data (e.g., trends, summaries)
Users | Used by clerks, database admins, and developers | Used by data analysts and decision-makers
Data Update | Frequently updated with real-time data | Data is updated periodically (e.g., daily, weekly)
Design | Optimized for read/write operations (OLTP) | Optimized for read-heavy operations (OLAP)
Normalization | Highly normalized (to avoid redundancy) | Often denormalized (to improve query speed)
Examples | MySQL, Oracle DB, PostgreSQL | Amazon Redshift, Google BigQuery, Snowflake

2.4. Explain the Data warehouse architecture

Data Warehouse Architecture

A Data Warehouse is a system that combines data from multiple sources, organizes it under a
single architecture, and helps organizations make better decisions. It simplifies data handling,
storage, and reporting, making analysis more efficient. Data Warehouse Architecture uses a
structured framework to manage and store data effectively.

There are two common approaches to constructing a data warehouse:


1. Top-Down Approach: This method starts with designing the overall data warehouse architecture
first and then creating individual data marts.
2. Bottom-Up Approach: In this method, data marts are built first to meet specific business needs, and
later integrated into a central data warehouse.

Before diving deep into these approaches, we will first discuss the components of data
warehouse architecture.

Components of Data Warehouse Architecture

A data warehouse architecture consists of several key components that work together to store,
manage, and analyze data.
 External Sources: External sources are where data originates. These sources provide a
variety of data types, such as:
 Structured data (databases, spreadsheets)
 Semi-structured data (XML, JSON)
 Unstructured data (emails, images)

 Staging Area: The staging area is a temporary space where raw data from external sources is
validated and prepared before entering the data warehouse. This process ensures that the data
is consistent and usable. To handle this preparation effectively, ETL (Extract, Transform,
Load) tools are used:
 Extract (E): Pulls raw data from external sources.
 Transform (T): Converts raw data into a standard, uniform format.
 Load (L): Loads the transformed data into the data warehouse for further processing.

 Data Warehouse: The data warehouse acts as the central repository for storing cleansed and
organized data. It contains metadata and raw data. The data warehouse serves as the
foundation for advanced analysis, reporting, and decision-making.

 Data Marts: A data mart is a subset of a data warehouse that stores data for a specific team
or purpose, like sales or marketing. It helps users quickly access the information they need
for their work.

 Data Mining: Data mining is the process of analyzing large datasets stored in the data
warehouse to uncover meaningful patterns, trends, and insights. The insights gained can
support decision-making, identify hidden opportunities, and improve operational efficiency.

Top-Down Approach

The Top-Down Approach, introduced by Bill Inmon, is a method for designing data
warehouses that starts by building a centralized, company-wide data warehouse. This central
repository acts as the single source of truth for managing and analyzing data across the
organization. It ensures data consistency and provides a strong foundation for decision-
making.
Working of Top-Down Approach

1. Central Data Warehouse: The process begins with creating a comprehensive data
warehouse where data from various sources is collected, integrated, and stored. This involves
the ETL (Extract, Transform, Load) process to clean and transform the data.

2. Specialized Data Marts: Once the central warehouse is established, smaller, department-
specific data marts (e.g., for finance or marketing) are built. These data marts pull
information from the main data warehouse, ensuring consistency across departments.

This structured approach allows organizations to maintain a high level of data integrity and
provides a robust framework for data analysis and reporting.
Bottom-Up Approach
The Bottom-Up Approach, popularized by Ralph Kimball, takes a more flexible and
incremental path to designing data warehouses. Instead of starting with a central data
warehouse, it begins by building small, department-specific data marts that cater to the
immediate needs of individual teams, such as sales or finance. These data marts are later
integrated to form a larger, unified data warehouse.

Fig (B): Bottom-up approach
Working of Bottom-Up Approach

1. Department-Specific Data Marts: The process starts with creating data marts for individual
departments or specific business functions. These data marts are designed to meet immediate
data analysis and reporting needs, allowing departments to gain quick insights.
2. Integration into a Data Warehouse: Over time, these data marts are connected and
consolidated to create a unified data warehouse. The integration ensures consistency and
provides a comprehensive view of the organization’s data.
✅ Advantages of Data Warehouse:

1. Improved Decision Making


o Helps managers make better decisions by providing accurate, consistent data.
2. Faster Query Performance
o Large amounts of data can be searched and analyzed quickly.
3. Data Integration
o Combines data from different sources (e.g., sales, marketing, finance) into one
place.
4. Historical Analysis
o Stores historical data, useful for trend analysis over time.
5. High Data Quality
o Data is cleaned, organized, and validated before storage.
6. Support for Business Intelligence (BI)
o Useful for reporting, dashboards, data mining, and analytics.

❌ Disadvantages of Data Warehouse:

1. High Cost
o Expensive to build and maintain (hardware, software, skilled staff).
2. Complex Implementation
o Needs careful planning and time to implement properly.
3. Data Loading Time
o Loading huge amounts of data (ETL process) can be time-consuming.
4. Not Suitable for Real-Time Data
o Usually works with batch data, not real-time updates.
5. Difficult to Change
o Once built, making changes or adding new data sources is complex.

2.5. Explain three-tier Data Warehouse Architecture
Data warehousing is essential for businesses looking to make informed
decisions based on large amounts of information. The architecture of a
data warehouse is key to its effectiveness, influencing how easily data can
be accessed and used. The Three/Multi-Tier Data Warehouse
Architecture is widely adopted due to its clear and organized framework.
This architecture divides data handling into three main layers:
 Bottom Tier (Data Sources and Data Storage)
 Middle Tier (OLAP Engine)
 Top Tier (Front-End Tools)
Three/Multi-tier Architecture of Data Warehouse

Bottom Tier

The Bottom Tier serves as the foundation of the data warehouse architecture. It is
primarily responsible for data collection and storage. This tier typically utilizes a
warehouse database server, often an RDBMS (Relational Database Management
System), to house data extracted from various operational and external sources. The
core activities in this tier involve the ETL process, which stands for Extract,
Transform, and Load.

ETL Process

The ETL process is vital for ensuring that data is clean, consistent, and optimized for
quick retrieval. The steps involved are:
1. Extract: Data is gathered from various sources, including relational databases, flat files, and
web services.
2. Transform: Data is cleaned and transformed to align with business logic and analysis
needs.
3. Load: The transformed data is loaded into a structured repository, often housed in an
RDBMS or a multidimensional database system.
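
The three ETL steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming an in-memory SQLite database stands in for the warehouse; the sample CSV, the `orders` table, and the cleaning rules (trim and lowercase names, skip duplicates and non-numeric amounts) are hypothetical and not taken from any particular ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file, database, or web service.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,20.00
2,BOB,20.00
3,carol,not_a_number
"""

def extract(text):
    """Extract: read raw rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize names, drop duplicate ids and rows with invalid amounts."""
    seen, clean = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not numeric
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # skip duplicate order ids
        seen.add(order_id)
        clean.append((order_id, row["customer"].strip().lower(), amount))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stands in for the warehouse database
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping each step as its own function mirrors the Extract, Transform, Load split and makes it easy to test the cleaning rules separately from the loading.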

Common Challenges in Bottom Tier


 Data Quality: Inconsistent data can lead to errors and unreliable analytics.
 Data Compatibility: Different data formats and structures complicate integration.
 Scalability: Efficiently handling increasing volumes of data.

Solutions
 Implement Robust ETL Tools: Utilize powerful ETL tools like Informatica, Microsoft SSIS,
or Confluent.
 Standardize Data Formats: Standardizing data at the point of entry minimizes compatibility
issues.
 Continuous Data Quality Management: Regularly check and clean data to maintain high
quality.
 Scalability Planning: Design data storage solutions that can expand as data volume grows.

Middle Tier

The Middle Tier is where the OLAP (Online Analytical Processing) server resides.
This tier acts as the processing layer that manages and enables complex analytical
queries on data stored in the bottom tier. It serves as a mediator between the data
repository and the end-user interface.

OLAP Models

OLAP technology is designed for high-speed analytical processing and comes in
three categories:
 ROLAP (Relational OLAP): Uses a relational database to manage warehouse data, ideal
for large data volumes.
 MOLAP (Multidimensional OLAP): Stores data in a multidimensional cube, making it
efficient for complex analytical queries.
 HOLAP (Hybrid OLAP): Combines relational and multidimensional processing paradigms.

Common Challenges in Middle Tier


 Data Latency: Delays in data availability can impact decision-making.
 Query Performance: Large data volumes can slow down query performance.
 Data Integration: Combining data from different sources with varying formats can be
challenging.

Solutions
 Real-Time Data Processing: Implement real-time processing and incremental loading
techniques to reduce data latency.
 Query Optimization Techniques: Utilize indexing and partitioning to improve query
performance.
 Standardization and Advanced Integration Tools: Standardize data formats and
implement robust middleware solutions.
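
One common way to do incremental loading is to copy only the rows that changed since the last load, tracked by a "high-water mark". The sketch below assumes two in-memory SQLite databases standing in for a source system and the warehouse; the `sales` table, its `updated_at` column, and the index name are hypothetical.

```python
import sqlite3

src = sqlite3.connect(":memory:")  # stands in for an operational source
dwh = sqlite3.connect(":memory:")  # stands in for the warehouse
for db in (src, dwh):
    db.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
# Index the column used for filtering, so the incremental query avoids a full scan.
src.execute("CREATE INDEX idx_sales_updated ON sales (updated_at)")

def incremental_load(src, dwh):
    """Copy only source rows newer than the warehouse's high-water mark."""
    watermark = dwh.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM sales"
    ).fetchone()[0]
    rows = src.execute(
        "SELECT id, amount, updated_at FROM sales "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    dwh.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", rows)
    dwh.commit()
    return len(rows)

src.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, 20.0, "2024-01-02")],
)
first = incremental_load(src, dwh)   # initial load copies both rows
src.execute("INSERT INTO sales VALUES (3, 30.0, '2024-01-03')")
second = incremental_load(src, dwh)  # later load copies only the new row
total = dwh.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(first, second, total)
```

The dates are ISO-formatted strings, so plain string comparison orders them correctly; a real pipeline would also need to handle rows that share the same timestamp as the watermark.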

Top Tier

The Top Tier comprises the front-end client layer, essential for interacting with the
data stored and processed in the lower tiers. This layer includes various business
intelligence (BI) tools designed to facilitate easy access and manipulation of data for
reporting, analysis, and decision-making.
Popular BI Tools
 IBM Cognos: Comprehensive reporting capabilities.
 Microsoft BI Platform: Integrates well with existing Microsoft products.
 SAP BW: Manages large datasets and integrates with other SAP products.
 Crystal Reports: Known for powerful reporting features.
 SAS Business Intelligence: Provides advanced analytics.
 Pentaho: Versatile tool for data integration and visualization.

Common Challenges in Top Tier


 Usability Issues: Complex tools can hinder user adoption.
 Integration Difficulties: Ensuring seamless integration with other tiers can be challenging.

Solutions
 User Training and Support: Offer comprehensive training sessions for users.
 Choosing Integrative Tools: Select tools that easily integrate with existing systems.

2.6. State the importance of Operational Data Stores
An Operational Data Store (ODS) is crucial for businesses because
it provides a real-time, consolidated view of operational data, enabling faster, more informed
decision-making.

1. Real-Time Information:
o ODS gives current (up-to-date) data for daily business activities.
2. Quick Decision Making:
o Managers and employees can make quick decisions using the latest data.
3. Combines Data from Different Places:
o It collects data from many systems (like billing, sales, customer service) into
one place.
4. Improves Work Efficiency:
o Staff don’t need to check many systems—ODS gives a single view of
everything.
5. Reduces Pressure on Main Systems:
o It handles reporting and queries, so the main systems stay fast and free for
daily work.
6. Prepares Data for Analysis:
o It helps clean and organize data before sending it to the data warehouse for
deeper analysis.
2.7. Define ETL and ELT
1. ETL – Extract, Transform, Load

Definition:
ETL is a data integration process where data is first Extracted from source systems, then
Transformed (cleaned, formatted), and finally Loaded into a data warehouse or database.

Steps:

1. Extract – Get data from various sources (e.g., databases, files).


2. Transform – Clean, format, and apply business rules to the data.
3. Load – Store the transformed data into the target system (usually a data warehouse).

Used When:

 Data needs heavy transformation before loading


 Traditional data warehouses are used

2. ELT – Extract, Load, Transform

Definition:
ELT is a variation where data is first Extracted, then Loaded directly into the target system,
and then Transformed inside the system (e.g., using SQL in a data warehouse or data lake).

Steps:

1. Extract – Pull data from source systems.


2. Load – Directly load raw data into the target system.
3. Transform – Perform transformation within the target (like a cloud warehouse).

Used When:

 The target system is powerful (e.g., cloud-based like Snowflake, BigQuery)


 Suitable for big data and flexible processing
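
To contrast with ETL, here is a minimal ELT sketch: raw records are loaded into the target unchanged, and the cleaning happens afterwards inside the target using SQL. An in-memory SQLite database is assumed as a stand-in for a cloud warehouse, and the `raw_orders` and `customer_totals` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the target warehouse

# Extract + Load: raw records land in the target untouched, all as text.
conn.execute("CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [("1", " Alice ", "10.50"), ("2", "BOB", "20.00"), ("3", "alice", "4.50")],
)

# Transform: performed inside the target with SQL, after loading.
conn.execute(
    """
    CREATE TABLE customer_totals AS
    SELECT LOWER(TRIM(customer)) AS customer,
           SUM(CAST(amount AS REAL)) AS total
    FROM raw_orders
    GROUP BY LOWER(TRIM(customer))
    """
)
totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(totals)
```

Because the raw table is kept, the transformation can be rewritten and rerun later without extracting from the source again, which is one reason ELT suits powerful target systems.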

2.8. List types of data warehouses

There are three main types of data warehouses:

1. Enterprise Data Warehouse (EDW):

 A centralized repository for the entire organization.


 Stores all historical data from various departments.
 Supports decision-making across the whole enterprise.
 Example: A large company consolidating data from sales, HR, finance, etc., into one
centralized location.
2. Operational Data Store (ODS):
 Designed for real-time or near real-time reporting.

 Contains current, up-to-date data rather than historical data.


 Supports daily operations and immediate decision-making, not long-term analysis.
 Example: A bank utilizing an ODS to display real-time customer balances.
3. Data Mart:
 Concentrates on a specific department or business unit, such as sales or HR.

 Easier to manage and provides faster access to relevant data.


 Example: A retail company maintaining a separate data mart specifically for sales
analysis.

2.9. Explain data warehousing model


A data warehousing model refers to the structured design and organization
of data within a data warehouse to facilitate efficient storage, retrieval, and
analysis for business intelligence and decision-making. It aims to integrate
data from various sources into a cohesive and consistent framework,
optimizing it for analytical queries rather than transactional processing.

Key Components of Data Warehousing Models:


 Enterprise Data Warehouse (EDW):
This is a centralized, comprehensive data repository that integrates data from
across the entire organization. It aims to provide a unified, holistic view of business
operations and typically contains highly detailed, historical data.
 Data Marts:
These are smaller, subject-oriented subsets of an EDW, designed to serve the
specific analytical needs of a particular department or business function (e.g.,
sales, marketing, finance).
 Dependent Data Marts: Built by extracting data from an existing EDW.
 Independent Data Marts: Created directly from operational data sources, without
relying on a central EDW.
 Virtual Data Warehouse:
This is not a physical repository but rather a logical view of operational data,
providing a quick overview without the need for extensive data extraction and
loading. It uses middleware to integrate data from disparate sources on demand.
Common Data Modeling Techniques:
 Dimensional Modeling (Kimball Methodology):
This is a widely adopted approach for data warehousing, focusing on creating a
user-friendly, performance-optimized structure for analytical queries.
 Fact Tables: Store quantitative measures (metrics) and foreign keys to dimension
tables.
 Dimension Tables: Store descriptive attributes related to the facts (e.g., time,
product, customer).
 Star Schema: A simple dimensional model where a central fact table is directly
connected to multiple dimension tables.
 Snowflake Schema: An extension of the star schema where dimension tables are
further normalized into sub-dimensions.
 Normalized Modeling (Inmon Methodology):
This approach emphasizes data integrity and minimizes redundancy by adhering to
database normalization rules (e.g., 3NF). It creates a highly normalized structure,
similar to a transactional database, but optimized for reporting.
 Data Vault Modeling:
A more recent approach designed for agile data warehouse development,
emphasizing scalability and integration. It organizes data into:
 Hubs: Represent core business entities.
 Links: Represent relationships between hubs.
 Satellites: Store descriptive attributes about hubs or links, allowing for historical
tracking of changes.
The choice of data warehousing model and modeling technique depends
on factors such as organizational size, data volume, analytical
requirements, and development methodology.
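
As a rough illustration of the Data Vault structure described above, the sketch below creates one hub per business entity, a link between them, and a satellite that keeps attribute history. It assumes an in-memory SQLite database; all table and column names are hypothetical, and real Data Vault designs usually add hash keys and record-source columns that are left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hubs: one row per core business entity
    CREATE TABLE hub_customer (customer_key TEXT PRIMARY KEY);
    CREATE TABLE hub_product (product_key TEXT PRIMARY KEY);
    -- Link: a relationship between two hubs
    CREATE TABLE link_purchase (
        customer_key TEXT REFERENCES hub_customer(customer_key),
        product_key TEXT REFERENCES hub_product(product_key)
    );
    -- Satellite: descriptive attributes, one row per change
    CREATE TABLE sat_customer (
        customer_key TEXT REFERENCES hub_customer(customer_key),
        city TEXT,
        load_date TEXT
    );
    """
)
conn.execute("INSERT INTO hub_customer VALUES ('C1')")
conn.execute("INSERT INTO hub_product VALUES ('P1')")
conn.execute("INSERT INTO link_purchase VALUES ('C1', 'P1')")
# A changed attribute adds a new satellite row instead of overwriting the old one.
conn.executemany(
    "INSERT INTO sat_customer VALUES (?, ?, ?)",
    [("C1", "Chennai", "2024-01-01"), ("C1", "Mumbai", "2024-06-01")],
)
history = conn.execute(
    "SELECT city FROM sat_customer WHERE customer_key = 'C1' ORDER BY load_date"
).fetchall()
current = conn.execute(
    "SELECT city FROM sat_customer WHERE customer_key = 'C1' "
    "ORDER BY load_date DESC LIMIT 1"
).fetchone()[0]
print(history, current)
```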

2.10 Explain data warehouse design approaches

Data Warehouse Design

Data warehouse design is the process of designing the structure, schema, and flow
of data in a data warehouse to support efficient data analysis and reporting. It
involves organizing data, modeling relationships, and integrating data from multiple sources.

Main Data Warehouse Design Approaches

There are three major approaches to data warehouse design:

1. Top-Down Approach (Inmon’s Approach)


 Definition: Begins with a centralized enterprise data warehouse (EDW) from which data
marts are created.
 Steps:
 Build a normalized EDW (3NF).
 Load data into EDW via ETL processes.
 Create data marts using dimensional modeling.
 Use data marts for analysis/reporting.
 Characteristics: Central repository, integrated data, normalized EDW.
 Advantages: Consistent, integrated data; good for long-term strategy; avoids redundancy.
 Disadvantages: High initial cost and time; complex implementation; longer time to show
results.

2. Bottom-Up Approach (Kimball’s Approach)

 Definition: Starts with data marts for individual departments, which are later integrated into
a data warehouse.
 Steps:
 Identify business processes (e.g., sales, orders).
 Create dimensional models (star/snowflake schema).
 Build data marts for each process.
 Integrate data marts using a bus architecture.
 Characteristics: Fast and incremental; uses denormalized models; focuses on user needs.
 Advantages: Quick results; early ROI; easy to understand and maintain; lower initial cost.
 Disadvantages: Risk of data silos; integration challenges; potential redundancy.

3. Hybrid Approach

 Definition: Combines both top-down and bottom-up methods, starting with important data
marts while planning for an enterprise-wide structure.
 Characteristics: Uses dimensional modeling with an enterprise vision; balances speed and
consistency.
 Advantages: Quick implementation; maintains consistency and integration; scalable and
flexible.
 Disadvantages: Complex design and management; requires careful planning and
governance.

Design Modeling Techniques


 Dimensional Modeling (used in Kimball's method): Includes Star Schema and Snowflake
Schema.
 ER Modeling (Entity-Relationship) (used in Inmon’s method): Focuses on a normalized
structure.
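
A star schema from dimensional modeling can be sketched as one fact table with foreign keys to its dimension tables. The example below is a minimal illustration on an in-memory SQLite database; the `fact_sales`, `dim_product`, and `dim_date` tables and their sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    -- Fact table: numeric measures plus foreign keys to the dimensions
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        units INTEGER,
        revenue REAL
    );
    INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
    INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
    INSERT INTO fact_sales VALUES (1, 1, 10, 50.0), (1, 2, 5, 25.0), (2, 2, 1, 40.0);
    """
)
# A typical analytical query: join the fact table to a dimension and aggregate.
revenue_by_category = conn.execute(
    """
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
print(revenue_by_category)
```

In a snowflake schema, `dim_product` would be split further, for example by moving `category` into its own table that `dim_product` references.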
2.11 DEFINE TERMS METADATA, DATA MART
METADATA
Metadata is data about data: it describes other data, such
as its structure, content, and usage.
 Metadata helps in understanding, managing, and utilizing data effectively.
 It can be about various aspects of data, including its creation, modification, and
purpose.
 Metadata helps in finding, organizing, and categorizing data.
 Examples include file names, creation dates, author information, and data types.
 In data warehousing, metadata acts as a roadmap, helping users understand the
structure and content of the data warehouse.
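
As a small illustration of metadata acting as a roadmap, a database can be asked to describe its own tables and columns. The sketch below assumes SQLite, where the `sqlite_master` table and `PRAGMA table_info` expose this information; the `customers` table and the `describe` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, joined_on TEXT)")

def describe(conn):
    """Return {table: [(column, declared_type), ...]} from SQLite's own metadata."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows are (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog

catalog = describe(conn)
print(catalog)
```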

DATA MART
A data mart is a subset of a data warehouse focused on
a specific subject or business line, designed for efficient analysis and
reporting within that area.
 A data mart is a focused, subject-oriented database derived from a larger data warehouse.
 It contains a specific set of data relevant to a particular business unit or function.
 Data marts are designed for faster and more efficient analysis of specific business areas.
 For instance, a marketing department might have a data mart containing customer data,
campaign results, and marketing metrics.
 They are smaller and more manageable than a full data warehouse, making them easier to
implement and maintain.
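
One simple way to picture a dependent data mart is as a filtered view over the warehouse. This is only a sketch, assuming an in-memory SQLite database; the `warehouse_facts` table, the `marketing_mart` view, and the sample values are hypothetical, and real data marts are often separate physical tables or databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Central warehouse table holding data for every department
    CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL);
    INSERT INTO warehouse_facts VALUES
        ('marketing', 'campaign_clicks', 1200),
        ('marketing', 'leads', 85),
        ('finance', 'invoices', 40);
    -- Data mart: only the slice the marketing team needs
    CREATE VIEW marketing_mart AS
        SELECT metric, value FROM warehouse_facts WHERE department = 'marketing';
    """
)
mart_rows = conn.execute(
    "SELECT metric, value FROM marketing_mart ORDER BY metric"
).fetchall()
print(mart_rows)
```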

2.12 DEFINE OLAP

OLAP Overview
 Purpose: OLAP enables users to analyze large datasets from multiple perspectives,
primarily for business intelligence and decision support.
 Data Model: Utilizes a multidimensional data model, often represented as an "OLAP cube,"
to facilitate efficient data analysis.

Key Features
1. Multidimensional Analysis: Users can analyze data across various dimensions (e.g., time,
product, region) simultaneously.
2. Data Cubes: The OLAP cube organizes data in a multidimensional structure, allowing for
operations like "drilling down" (viewing detailed data) and "rolling up" (viewing summarized
data).
3. Speed and Efficiency: Designed for fast query processing, enabling quick retrieval and
analysis of large datasets.
4. Business Intelligence: Integral to BI systems, providing tools for reporting, analysis, and
informed decision-making.

Comparison with OLTP


 OLAP vs. OLTP: OLAP focuses on historical data analysis and trend identification, while
OLTP (Online Transaction Processing) is centered on real-time transaction processing (e.g.,
recording sales).

Applications
 Commonly used in financial reporting, sales forecasting, budgeting, and market analysis.

2.13 LIST THE CHARACTERISTICS OF OLAP


Online Analytical Processing (OLAP) systems are designed specifically for analytical
purposes and possess several distinct characteristics that set them apart from
transactional systems. Here are the key features:

1. Multidimensional Data Analysis:


OLAP systems organize data into multidimensional structures, often referred to as
"cubes." This allows users to analyze data across various dimensions such as time,
product, geography, and customer. Users can view data from multiple perspectives
and perform complex analytical queries, including drill-down, roll-up, slice, dice, and
pivot operations.

2. High Performance for Analytical Queries:


OLAP systems are optimized for the fast retrieval and aggregation of large volumes
of data, providing quick responses to complex analytical queries. This performance
is achieved through pre-calculated aggregations and optimized data storage
structures.

3. Support for Complex Calculations and Business Logic:


OLAP systems facilitate the execution of complex calculations, aggregations, and
business rules. This capability enables users to derive insights and perform activities
such as forecasting, budgeting, and planning.

4. Historical Data Analysis:


OLAP systems are designed to store and analyze historical data, allowing users to
identify trends, patterns, and anomalies over time. This feature is crucial for strategic
decision-making.
5. User-Friendly Interface:
OLAP tools typically provide intuitive and interactive interfaces, often integrated with
popular business intelligence tools like dashboards and reporting applications. This
accessibility makes them usable for business users without extensive technical
knowledge.

6. Client/Server Architecture:
OLAP systems often employ a client/server architecture. In this setup, the OLAP
server processes and stores the multidimensional data, while client applications
provide the user interface for data interaction and analysis.

7. Data Integration from Multiple Sources:


OLAP systems can integrate data from various operational systems and data
sources, consolidating it into a unified view for comprehensive analysis.
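
The cube operations named in the first characteristic (roll-up, drill-down, slice, dice) can be illustrated with a tiny cube held in a Python dictionary. This is a teaching sketch, not how an OLAP server stores data; the dimension names, the sales figures, and the helper functions are all hypothetical.

```python
from collections import defaultdict

DIMS = ("year", "quarter", "product", "region")

# Each cell maps one combination of dimension values to a sales figure (made up).
cells = {
    (2024, "Q1", "pen", "north"): 100,
    (2024, "Q1", "pen", "south"): 80,
    (2024, "Q2", "pen", "north"): 120,
    (2024, "Q2", "lamp", "north"): 60,
}

def roll_up(cells, keep):
    """Aggregate away every dimension that is not listed in keep."""
    positions = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, value in cells.items():
        out[tuple(key[i] for i in positions)] += value
    return dict(out)

def slice_cube(cells, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] == value}

def dice(cells, **allowed):
    """Dice: keep cells whose dimensions fall inside the given value sets."""
    return {
        k: v
        for k, v in cells.items()
        if all(k[DIMS.index(d)] in values for d, values in allowed.items())
    }

by_year = roll_up(cells, ["year"])                # summarized view
by_quarter = roll_up(cells, ["year", "quarter"])  # drill-down: year to quarter
q1 = slice_cube(cells, "quarter", "Q1")
north_pens = dice(cells, product={"pen"}, region={"north"})
print(by_year, by_quarter)
```

Drill-down is shown here as rolling up to a finer level (year and quarter instead of year alone), which is the reverse direction of roll-up.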

2.14 DIFFERENTIATE BETWEEN OLTP AND OLAP


Criteria | OLAP | OLTP
Purpose | OLAP helps you analyze large volumes of data to support decision-making. | OLTP helps you manage and process real-time transactions.
Data source | OLAP uses historical and aggregated data from multiple sources. | OLTP uses real-time and transactional data from a single source.
Data structure | OLAP uses multidimensional (cubes) or relational databases. | OLTP uses relational databases.
Data model | OLAP uses star schema, snowflake schema, or other analytical models. | OLTP uses normalized or denormalized models.
Volume of data | OLAP has large storage requirements. Think terabytes (TB) and petabytes (PB). | OLTP has comparatively smaller storage requirements. Think gigabytes (GB).
Response time | OLAP has longer response times, typically in seconds or minutes. | OLTP has shorter response times, typically in milliseconds.
Example applications | OLAP is good for analyzing trends, predicting customer behavior, and identifying profitability. | OLTP is good for processing payments, customer data management, and order processing.
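
The difference in workload can be seen in the kind of statements each side runs. The sketch below uses one in-memory SQLite table for both styles purely for illustration; in practice OLTP and OLAP usually run on separate systems, and the `orders` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL, status TEXT)"
)

# OLTP-style work: short transactions that insert or update individual rows.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders VALUES (1, 'alice', 30.0, 'new')")
    conn.execute("INSERT INTO orders VALUES (2, 'bob', 20.0, 'new')")
    conn.execute("INSERT INTO orders VALUES (3, 'alice', 50.0, 'new')")
with conn:
    conn.execute("UPDATE orders SET status = 'paid' WHERE id = 1")

# OLAP-style work: a read-only query that scans and aggregates many rows.
spend = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
status = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
print(spend, status)
```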

2.15 LIST THE TYPES OF OLAP

There are three primary types of OLAP (Online Analytical Processing)
systems: MOLAP (Multidimensional OLAP), ROLAP (Relational OLAP),
and HOLAP (Hybrid OLAP). Each type offers different approaches to
storing and analyzing data, with varying strengths and weaknesses.

1. MOLAP (Multidimensional OLAP):


 Data Storage: MOLAP stores data in a multidimensional cube structure, optimized for fast
data analysis.

 Performance: It excels in speed and efficiency for complex queries due to pre-calculated
aggregations.
 Scalability: MOLAP can struggle with very large datasets.

 Examples: Microsoft Analysis Services, IBM Cognos TM1, and Essbase.


2. ROLAP (Relational OLAP):
 Data Storage: ROLAP stores data in relational databases, leveraging SQL for analysis.

 Performance: ROLAP can handle large volumes of data but may be slower than MOLAP for
complex queries.

 Scalability: ROLAP scales well with large datasets and is suitable for complex queries
against massive datasets.

 Examples: Uses existing relational database systems.


3. HOLAP (Hybrid OLAP):
 Data Storage: HOLAP combines the strengths of both MOLAP and ROLAP, storing some
data in a multidimensional structure and other data in a relational database.

 Performance: HOLAP offers a balance between speed and scalability.

 Scalability: HOLAP is designed to handle large volumes of data while also providing faster
query performance for specific aggregated data.

 Examples: A combination of MOLAP and ROLAP approaches.
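
The storage difference between ROLAP and MOLAP can be sketched side by side: the ROLAP-style function aggregates with SQL at query time, while the MOLAP-style structure pre-calculates totals once and answers by lookup. This is a simplified illustration, assuming SQLite for the relational side and a Python dictionary for the cube; the data, the `"*"` marker for "all values", and the function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

rows = [("pen", "north", 100.0), ("pen", "south", 80.0), ("lamp", "north", 60.0)]

# ROLAP-style: data stays in relational tables and is aggregated at query time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

def rolap_total(product):
    """Compute the total for one product with a SQL aggregate."""
    return conn.execute(
        "SELECT SUM(amount) FROM sales WHERE product = ?", (product,)
    ).fetchone()[0]

# MOLAP-style: aggregates are pre-calculated into a cube, then read by lookup.
cube = defaultdict(float)
for product, region, amount in rows:
    cube[(product, region)] += amount
    cube[(product, "*")] += amount  # "*" means "all regions" in this sketch
    cube[("*", region)] += amount   # "*" means "all products"
    cube[("*", "*")] += amount      # grand total

def molap_total(product):
    """Read the pre-calculated total for one product."""
    return cube[(product, "*")]

print(rolap_total("pen"), molap_total("pen"))
```

A HOLAP system would combine the two, for example keeping summary totals in the cube and going back to the relational table for detail rows.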

2.16 DIFFERENTIATE BETWEEN DATA MINING AND DATA WAREHOUSE

Aspect | Data Warehousing | Data Mining
Purpose | Data warehousing focuses on storing and organizing large volumes of structured data. | Data mining focuses on analyzing data to uncover patterns, trends, and insights.
Process | It involves collecting, cleaning, and integrating data from multiple sources. | It involves using algorithms and techniques to analyze and interpret the data.
Functionality | It supports reporting, querying, and data analysis tools for business intelligence. | It helps predict outcomes and make data-driven decisions based on the insights.
Data State | Stores historical and current data in a structured and consistent format. | Works on stored data to identify relationships and predict future patterns.
Users | Used by data analysts and business intelligence teams for reporting and monitoring. | Used by data scientists and decision-makers for predictive and descriptive insights.
Output | Provides organized data for easier access and use. | Generates actionable insights from analyzing the data.
