
Unit 4

This document provides an overview of data mining. It defines data mining as the process of extracting interesting patterns or knowledge from large amounts of data. It describes the key data mining techniques of classification, clustering, regression, and association rule mining. It also outlines the overall knowledge discovery process, including data cleaning, integration, selection, transformation, mining, evaluation, and presentation. Finally, it discusses applications of data mining such as market basket analysis, bioinformatics, education, and customer relationship management.

Uploaded by

Aleem Ali

• What is Data Mining?
• Why Data Mining?
• What is the KDD Process?
• On What Kind of Data?
• Data Mining Techniques
• Data Mining Query Language
• Applications of Data Mining
Introduction
• A huge amount of data is available in the information industry. This data is of no use until it is converted into useful information, so it must be analyzed and the useful information extracted from it.
• Extraction of information is not the only step. The overall process also involves Data Cleaning, Data Integration, Data Transformation, Data Mining, Pattern Evaluation and Data Presentation.
• Once all these processes are complete, the information can be used in many applications such as fraud detection, market analysis, science exploration, etc.
What is Data Mining?
• Extraction of interesting patterns or knowledge from a huge amount of data (Knowledge Discovery from Data)
• One of the steps in the KDD process
Why Data Mining?
• The explosive growth of data: from terabytes to petabytes
• We are drowning in data, but starving for knowledge!
• Fraud detection and detection of unusual patterns
What is the KDD Process?
• Data cleaning
  ◦ Remove noise and inconsistent data
• Data integration
  ◦ Combine multiple data sources
• Data selection
  ◦ Retrieve the data relevant to the analysis task
• Data transformation
  ◦ Convert the data into a unified format suitable for mining
• Data mining
  ◦ Extract patterns
• Pattern evaluation
  ◦ Identify the truly interesting patterns representing knowledge
• Knowledge presentation
  ◦ Present the mined knowledge to the user
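The KDD steps above can be sketched as a small pipeline. This is only an illustration: the two record lists, the field names, and the price threshold are all made up, and real systems would use far richer cleaning and mining logic.

```python
# Hypothetical toy records from two made-up sources; None marks a missing value.
store_a = [{"id": 1, "item": "Computer", "price": "1200"},
           {"id": 2, "item": "software", "price": None}]
store_b = [{"id": 3, "item": "COMPUTER", "price": "1100"},
           {"id": 4, "item": "Software", "price": "90"}]

def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: combine multiple sources into one list."""
    return [r for src in sources for r in src]

def select(records, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Data transformation: bring values into a unified format."""
    return [{"item": r["item"].lower(), "price": float(r["price"])}
            for r in records]

def mine(records):
    """Data mining: extract a simple pattern (average price per item)."""
    prices = {}
    for r in records:
        prices.setdefault(r["item"], []).append(r["price"])
    return {item: sum(p) / len(p) for item, p in prices.items()}

data = integrate(clean(store_a), clean(store_b))
data = transform(select(data, ["item", "price"]))
patterns = mine(data)

# Pattern evaluation: keep patterns above an arbitrary, assumed threshold.
interesting = {item: avg for item, avg in patterns.items() if avg > 100}

# Knowledge presentation: show the result to the user.
for item, avg in sorted(interesting.items()):
    print(f"{item}: average price {avg:.2f}")
```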
On What Kind of Data?
• Relational Databases
  ◦ A collection of tables
• Data Warehouses
  ◦ Data gathered from different sources
• Transactional Databases
  ◦ Consist of a file where each record represents a transaction
• Advanced Data & Applications
  ◦ Multimedia, spatial data and the WWW
Data Mining Techniques
Classification
Clustering
Regression
Association Rules
Classification
Classification is the process of predicting the class of a new item, that is, identifying which class the new item belongs to.
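As a minimal sketch of the idea, the example below assigns a new item the class of its closest labeled item (a nearest-neighbor rule, chosen here only for illustration). The training points and the class names "A" and "B" are made up.

```python
import math

# Made-up labeled items: (feature vector, class label).
training = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"),
            ((5.0, 5.0), "B"), ((6.0, 5.5), "B")]

def classify(new_item, labeled_items):
    """Return the class of the labeled item closest to new_item."""
    nearest = min(labeled_items,
                  key=lambda pair: math.dist(pair[0], new_item))
    return nearest[1]

print(classify((1.2, 1.4), training))  # closest to the "A" items
print(classify((5.5, 5.0), training))  # closest to the "B" items
```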
Clustering
• Group data into clusters
• Similar data is grouped in the same cluster
• Dissimilar data is grouped in different clusters
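A small sketch of grouping similar values together, using a simple one-dimensional k-means loop as one possible clustering method. The values and the starting centers are assumptions picked for the example.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group values around the given starting centers (simple k-means)."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its closest center.
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

values = [1, 2, 3, 20, 21, 22]
clusters = kmeans_1d(values, centers=[1, 22])
print(clusters)  # similar values share a cluster, dissimilar values do not
```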
Regression
• “Regression deals with the prediction of a value, rather than a class.”
• Regression is a data mining function that predicts a number.
• For example, a regression model could be used to predict children's height, given their age, weight, and other factors.
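The height example can be sketched with a least-squares line fitted on age alone. The ages and heights below are invented for illustration (they are not real growth data), and a single-feature line is just the simplest regression model to show.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: age in years, height in cm.
ages = [2, 4, 6, 8]
heights = [86, 100, 114, 128]

slope, intercept = fit_line(ages, heights)
predicted = intercept + slope * 5  # predicted height for a 5-year-old
print(f"height = {intercept:.1f} + {slope:.1f} * age; at age 5: {predicted:.1f} cm")
```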
Association Rules
“An association algorithm creates rules that describe how often events have occurred together.”

Example: When a customer buys a computer, then 90% of the time they will also buy software.
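The 90% figure in a rule like this is usually called the rule's confidence: of the transactions that contain a computer, the fraction that also contain software. A minimal sketch, using made-up transactions constructed to reproduce the 90% in the example:

```python
# Made-up transactions: 10 contain a computer, 9 of those also contain software.
transactions = (
    [{"computer", "software"}] * 9
    + [{"computer", "printer"}]
    + [{"software"}, {"printer"}]
)

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent, transactions):
    """Fraction of transactions with the antecedent that also have the consequent."""
    return (count(antecedent | consequent, transactions)
            / count(antecedent, transactions))

conf = confidence({"computer"}, {"software"}, transactions)
print(f"computer -> software: confidence {conf:.0%}")
```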
Data Mining Query Language
• A Data Mining Query Language (DMQL) can support interactive data mining.
• It adopts an SQL-like syntax.
• Hence, it can be easily integrated with relational query languages.
Applications of Data Mining
• Market Basket Analysis
  ◦ Market basket analysis is a modeling technique based upon the theory that if you buy a certain group of items, you are more likely to buy another group of items.
  ◦ This information may help the retailer understand the buyer’s needs and improve the store’s layout.
• Bioinformatics
  ◦ Mining biological data helps to extract useful knowledge from massive datasets gathered in biology and in other related life sciences areas.
  ◦ Applications of data mining to bioinformatics include gene finding, protein function inference, disease diagnosis, and disease treatment.
• Education
  ◦ Data mining can be used by an institution to make accurate decisions and also to predict student results.
  ◦ Students’ learning patterns can be captured and used to develop techniques to teach them.
• Customer Relationship Management (CRM)
  ◦ To maintain a proper relationship with a customer, a business needs to collect data and analyze the information.
  ◦ With data mining technologies, the collected data can be used for analysis.
Thank You…
