Data Warehousing and Data Mining

Data warehousing combines data from multiple sources into a single database to facilitate analysis. It aims to provide businesses with analytics results from data mining, OLAP, and reporting. Data mining extracts hidden patterns from large data sets to discover useful knowledge for decision making in areas like business, engineering, medicine, and more. It involves extracting, transforming, and loading data before analyzing it to gain insights.

Uploaded by Gourav Mandwal
Copyright © All Rights Reserved

DATA WAREHOUSING AND DATA MINING

1 4/23/2019 1:36:14 PM
DATA WAREHOUSING

• Data warehousing is the combining of data from multiple sources into one comprehensive and easily manipulated database.
• The primary aim of data warehousing is to provide businesses with analytics results from data mining, OLAP, scorecarding, and reporting.

Introduction, Definitions, and Terminology

• W. H. Inmon characterized a data warehouse as:
– “A subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management’s decisions.”

NEED FOR DATA WAREHOUSING

• Information is now considered the key to every kind of work.
• Those who gather, analyze, understand, and act upon information are winners.
• Information has no limits, and it is very hard to collect from many separate sources, so we need a data warehouse from which we can get all the information.

Applications that a data warehouse supports:

– OLAP (Online Analytical Processing) is a term used to describe the analysis of complex data from the data warehouse.
– DSS (Decision Support Systems), also known as EIS (Executive Information Systems), support an organization’s leading decision makers in making complex and important decisions.
– Data mining is used for knowledge discovery, the process of searching data for unanticipated new knowledge.
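As a rough illustration of the kind of analysis OLAP performs, the sketch below rolls a small set of sales facts up along different dimensions. The rows, the dimension names, and the roll_up helper are all hypothetical and chosen only for this example; a real OLAP tool would run such aggregations against the warehouse itself.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales amount).
SALES = [
    ("North", "Laptop", "Q1", 1200.0),
    ("North", "Phone", "Q1", 800.0),
    ("North", "Laptop", "Q2", 1500.0),
    ("South", "Laptop", "Q1", 700.0),
    ("South", "Phone", "Q2", 900.0),
]

# Assumed dimension names, in the same order as the first three fields above.
DIMENSIONS = ("region", "product", "quarter")


def roll_up(rows, keep):
    """Sum the sales measure over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]
    return dict(totals)


# Coarse view: total sales per region.
by_region = roll_up(SALES, ["region"])

# Finer view: total sales per region and quarter.
by_region_quarter = roll_up(SALES, ["region", "quarter"])

for key, total in sorted(by_region_quarter.items()):
    print(key, total)
```

Keeping fewer dimensions gives a more summarized view, and keeping more dimensions drills back down toward the detail rows.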

Conceptual Structure of Data Warehouse

• Data warehouse processing involves:
– Cleaning and reformatting of data
– OLAP
– Data mining

Conceptual Structure of Data Warehouse

[Figure: flow diagram with the labels Databases, Other Data Inputs, Cleaning, Reformatting, Data Warehouse, Metadata, OLAP, DSSI, EIS, Data Mining, Updates/New Data, and Back Flushing.]

DATA WAREHOUSING INCLUDES:

• Retrieving data
• Analyzing data
• Extracting data
• Loading data
• Transforming data
• Managing data

DATA WAREHOUSE ARCHITECTURE

• Data warehousing is designed to provide an architecture that makes corporate data accessible and useful to users.
• There is no right or wrong architecture.
• The worth of an architecture can be judged by its use and by the concept behind it.
• Data warehouses can be architected in many different ways, depending on the specific needs of a business.

Typical Data Warehousing Environment

• An operational data store (ODS) is a database used as a temporary storage area for a data warehouse.
• Its primary purpose is to handle data that is currently in use.
• An operational data store contains data that is constantly updated in the course of business operations.

• ETL (Extract, Transform, Load) is used to copy data from:
– the ODS to the data warehouse staging area
– the data warehouse staging area to the data warehouse
– the data warehouse to a data mart
• ETL extracts data, transforms the values of inconsistent data, cleanses "bad" data, filters data, and loads data into a target database.

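The ETL steps listed above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the source rows, the COUNTRY_CODES mapping, the cleansing rules, and the sales table are all hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import sqlite3

# Hypothetical rows as they might arrive from a source system such as the ODS.
SOURCE_ROWS = [
    {"id": "1", "customer": " alice ", "country": "us", "amount": "120.50"},
    {"id": "2", "customer": "Bob", "country": "USA", "amount": "80"},
    {"id": "3", "customer": "", "country": "IN", "amount": "n/a"},
    {"id": "4", "customer": "Chen", "country": "in", "amount": "-5"},
]

# Assumed mapping used to reconcile inconsistent country values.
COUNTRY_CODES = {"US": "US", "USA": "US", "IN": "IN"}


def extract(rows):
    """Extract: copy the rows out of the source."""
    return [dict(row) for row in rows]


def transform(rows):
    """Transform inconsistent values, cleanse bad rows, and filter the rest."""
    clean = []
    for row in rows:
        name = row["customer"].strip().title()
        country = COUNTRY_CODES.get(row["country"].strip().upper())
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # cleanse: the amount cannot be parsed
        if not name or country is None:
            continue  # cleanse: missing customer or unknown country
        if amount <= 0:
            continue  # filter: keep only positive sale amounts
        clean.append((int(row["id"]), name, country, amount))
    return clean


def load(conn, rows):
    """Load the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, customer TEXT, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_ROWS)))
loaded = conn.execute(
    "SELECT id, customer, country, amount FROM sales ORDER BY id"
).fetchall()
print(loaded)
```

Rows 3 and 4 never reach the target: the first has an unparseable amount and the second is filtered out by the assumed rule on positive amounts.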
• The data warehouse staging area is a temporary location where data from source systems is copied.
• It increases the speed of the data warehouse architecture.
• It is essential because the volume of data is increasing day by day.

• The purpose of the data warehouse is to integrate corporate data.
• The amount of data in the data warehouse is massive, and data is stored at a very deep level of detail.
• This allows data to be grouped in many unanticipated ways.
• A data warehouse does not contain all the data in the organization. Its purpose is to provide the base data that the organization needs for strategic and tactical decision making.

• ETL extracts data from the data warehouse and sends it to one or more data marts for use by end users.
• Data marts act as a shortcut to the data warehouse, to save time.
• A data mart is just a partition of the data present in the data warehouse.
• Each data mart can contain different combinations of tables, columns, and rows from the enterprise data warehouse.

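To make the idea of a data mart as a partition of the warehouse concrete, the sketch below builds a small mart that keeps only some columns and some rows of a warehouse table. The table names, columns, and data are hypothetical, and an in-memory SQLite database is assumed purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding detailed sales data.
conn.execute(
    "CREATE TABLE warehouse_sales "
    "(id INTEGER, region TEXT, product TEXT, amount REAL, cost REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "North", "Laptop", 1200.0, 900.0),
        (2, "South", "Phone", 800.0, 500.0),
        (3, "North", "Phone", 600.0, 400.0),
    ],
)

# The mart keeps a subset of columns (no region, no cost) and a subset
# of rows (only the North region), as one group of users might need.
conn.execute(
    "CREATE TABLE mart_north_sales AS "
    "SELECT id, product, amount FROM warehouse_sales WHERE region = 'North'"
)

mart_rows = conn.execute(
    "SELECT id, product, amount FROM mart_north_sales ORDER BY id"
).fetchall()
mart_columns = [
    column[1] for column in conn.execute("PRAGMA table_info(mart_north_sales)")
]

print(mart_columns)
print(mart_rows)
```

A second mart for another group of users could select a different combination of rows and columns from the same warehouse table.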
REASONS FOR CREATING A DATA MART

• Easy access to frequently needed data.
• Creates a collective view for a group of users.
• Improves user response time.
• Ease of creation.
• Lower cost than implementing a full data warehouse.

DATA MINING

• Data mining is the non-trivial extraction of implicit, previously unknown, and potentially useful information from large databases.
– Extremely large datasets
– Useful knowledge that can improve processes
– Cannot be done manually

Where Has It Come From?

Motivation

• Databases today are huge:
– More than 1,000,000 entities/records/rows
– From 10 to 10,000 fields/attributes/variables
– Gigabytes and terabytes
• Databases are growing at an unprecedented rate
• The corporate world is a cut-throat world
– Decisions must be made rapidly
– Decisions must be made with maximum knowledge

How does data mining work?

• Extract, transform, and load transaction data onto the data warehouse system.
• Store and manage the data in a multidimensional database system.
• Provide data access to business analysts and information technology professionals.
• Analyze the data with application software.
• Present the data in a useful format, such as a graph or table.
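The last two steps above, analyzing the data and presenting it as a table, can be sketched as follows. Counting item pairs that often appear together is just one simple mining technique picked for illustration; it is not named in the slides. The transactions, the frequent_pairs helper, and the 0.5 support threshold are all assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transaction data, as it might be read from the warehouse.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (share of transactions) meets min_support."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives each pair one canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(transactions)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }


# Analyze: keep pairs that occur in at least half of the transactions.
pairs = frequent_pairs(TRANSACTIONS, min_support=0.5)

# Present: print the result as a small table.
print(f"{'item pair':<20}support")
for pair, support in sorted(pairs.items()):
    print(f"{' + '.join(pair):<20}{support:.2f}")
```

With this sample data only bread and milk are bought together often enough to pass the threshold, which is the kind of previously unknown pattern the analysis step is meant to surface.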

DATA MINING MEASURES

• Accuracy
• Clarity
• Dirty Data
• Scalability
• Speed
• Validation

Typical Applications of Data Mining

ADVANTAGES OF DATA MINING

• Engineering and Technology
• Medical Science
• Business
• Combating Terrorism
• Games
• Research and Development

Engineering and Technology

• In electrical power engineering
– used for condition monitoring of high-voltage electrical equipment
– vibration monitoring and analysis of transformer on-load tap-changers
• Education
– to concentrate their knowledge

Medical Science

• Data mining has been widely used in bioinformatics and genetics.
• It is used to study the relationship between DNA sequences and variability in disease susceptibility, which is very important for improving the diagnosis, prevention, and treatment of diseases.

BUSINESS

• Used in customer relationship management applications
• Translates data from customer to merchant accurately
• Distributes business processes
• Powerful tool for marketing

Combating terrorism

• A concept used by Interpol against terrorists, for searching their records through the Multistate Anti-Terrorism Information Exchange
• In the Secure Flight program, the Computer Assisted Passenger Prescreening System, and Semantic Enhancement

Games

• Applied to certain combinatorial games through so-called tablebases (e.g. for 3x3 chess)
• It includes the extraction of human-usable strategies
• Berlekamp in dots-and-boxes and John Nunn in chess endgames are notable examples

Research And Development

• Helps to develop search algorithms
• Offers huge libraries of graphing and visualisation software
• Users can easily create optimal models

List of top data-mining software vendors

• Angoss Software
• Infor CRM Epiphany
• Portrait Software
• SAS
• G-Stat
• SPSS
• ThinkAnalytics
• Unica
• Viscovery

THANK YOU
