DataMiningTechniques

Data mining

Uploaded by

saikiransai9948
Copyright
© All Rights Reserved

See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/370497135

Data Mining Techniques Session Objective

Presentation · May 2023

Citations: 0 · Reads: 41

1 author:

Nilu Singh
Koneru Lakshmaiah Education Foundation

All content following this page was uploaded by Nilu Singh on 04 May 2023.



Data Mining Techniques

Dr. Nilu Singh


Department of Computer Science & Engineering,
Koneru Lakshmaiah Education Foundation (K.L.E.F).
(Deemed to be University estd., u/s 3 of UGC Act 1956)
Green Fields, Vaddeswaram, Guntur (AP)
Session Objective
• An ability to understand which methods and techniques are used for data mining.

• An ability to compare and evaluate different data mining techniques and to identify the appropriate data mining algorithm for a given application.
Topics to Be Covered
• Introduction
• Data Mining Techniques
-Classification
-Clustering
-Regression
-Association Rules
-Outlier Detection
-Sequential Patterns
-Prediction

• Conclusion

• References
Introduction
• Data mining is the process of examining large volumes of source data to uncover underlying and potentially useful patterns.

OR

Data mining is the process of looking at large banks of information to generate new information.

• Organizations can use many data mining techniques to detect patterns relevant to a specific purpose and turn raw data into actionable insights.

• These techniques/tools can incorporate statistical models, machine learning techniques, and
mathematical algorithms, such as neural networks or decision trees.
Cont.
Data mining techniques serve two main purposes:

1. Prediction: creating a predictive index from current information in order to predict future values.

2. Description: finding a descriptive index that better describes the patterns in the present data.
Data Mining Techniques
• How can we process a huge amount of data and draw conclusions from it, and which methods are used for this purpose?

➢ Analysts use the following data mining techniques: Association, Classification, Clustering, Prediction, Sequential Patterns, Outlier Detection, and Regression.
• These techniques are used for everything from the basics of data preparation to cutting-edge artificial intelligence.
Classification
• Classification assigns the items or variables in a data set to predefined groups or classes.

• Classification draws on linear programming, statistics, decision trees, artificial neural networks, and other methods.

• This technique has its origins in machine learning.

Note: Classification is used to develop software that can be trained to assign the items in a data set to different classes.
Cont.
• In classification analysis, you apply algorithms (e.g. Logistic Regression, Naïve Bayes, Stochastic Gradient Descent, K-Nearest Neighbor, Decision Tree, Random Forest, Support Vector Machine) to decide how new data should be classified.

Example: Email services use such algorithms to label an incoming email as legitimate or spam.
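The spam example above can be sketched with Naïve Bayes, one of the algorithms listed. This is a minimal illustration rather than a production filter: the training messages, the labels, and the function names `train_naive_bayes` and `classify` are all assumptions made up for this sketch.

```python
import math
from collections import Counter

def train_naive_bayes(messages):
    """messages: list of (text, label) pairs. Returns the fitted model."""
    word_counts = {}          # label -> Counter of word frequencies
    label_counts = Counter()  # label -> number of training messages
    vocabulary = set()
    for text, label in messages:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary

def classify(text, model):
    """Return the label with the highest log-posterior for the text."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: share of training messages carrying this label.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace (add-one) smoothing avoids log(0) for unseen words.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set, invented for illustration only.
training_data = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("please review the project report", "legitimate"),
]
model = train_naive_bayes(training_data)
print(classify("claim your free prize", model))
print(classify("project meeting on monday", model))
```

The classes ("spam", "legitimate") are fixed before training, which is what distinguishes classification from clustering.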
Cont.
• Classification is used to obtain important and relevant information about data and metadata.
• Data mining frameworks themselves can also be classified:

➢ According to the type of data sources mined.

➢ According to the database involved.

➢ According to the kind of knowledge discovered.

➢ According to the data mining techniques used.

Clustering
• Clustering divides information into groups of related objects, i.e. it identifies similar data.

• This technique helps to recognize the differences and similarities between data items.

• It is similar to classification, but the groups are not predefined: chunks of data are grouped together based on their similarities.
Cont.
• The most common clustering algorithms include DBSCAN, K-Means, BIRCH, Mean Shift, and many more.

Example:

In a library, we can use clustering to keep similar books on one shelf and then give each shelf a meaningful name, so that readers looking for books on a particular topic can go straight to that shelf instead of searching the entire library.
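As a rough sketch of how K-Means, one of the algorithms named above, forms such groups, the following code clusters a handful of 2-D points. The points, the choice of k = 2, and seeding the centroids with the first k points are assumptions made to keep the example small and deterministic; real implementations usually choose initial centroids more carefully.

```python
import math

def k_means(points, k, iterations=10):
    """Group 2-D points into k clusters; returns (centroids, labels)."""
    # Assumption: seed centroids with the first k points to stay deterministic.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels

# Hypothetical feature vectors, e.g. two numeric attributes per book.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, labels = k_means(points, k=2)
print(labels)
```

No class names are supplied in advance; the groups emerge only from the distances between points.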
Regression
• Regression is used to analyze the relationships between different variables and is valued for its predictive power.

• It is used to identify the likelihood of a certain variable, given the presence of other variables.

• Regression analysis is also used to forecast the future value of a specific entity.

• The main aim of regression is to show the relationship between two variables in a data set.
Cont.
Example:

We might use it to project certain costs, depending on other factors such as availability, consumer demand, and competition. Primarily, it estimates the relationship between two or more variables in the given data set.
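To make the cost-projection example concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The demand and cost figures are invented for illustration, and `fit_line` is a hypothetical helper name, not part of any library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: consumer demand (units) and observed cost.
demand = [10, 20, 30, 40, 50]
cost = [25.0, 45.0, 65.0, 85.0, 105.0]

slope, intercept = fit_line(demand, cost)
projected_cost = slope * 60 + intercept  # project cost at a demand of 60
print(slope, intercept, projected_cost)
```

With several influencing factors (availability, demand, competition) the same idea extends to multiple regression, which fits one coefficient per factor.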
Association Rules
• Association rule mining is one of the most widely used data mining techniques.

• In this technique, transactions and the relationships between their items are used to identify patterns.

• It is used to conduct market basket analysis, which finds the products that customers regularly buy together.
Cont.
Example:
Consider a list of the grocery items that you have bought over the last six months. Association rule mining calculates how often items are purchased together. For a rule "item A → item B", the three major measures are:
Support:
How often items A and B are purchased together, relative to the entire dataset.
Support = (transactions containing both A and B) / (all transactions)
Confidence:
How often item B is purchased when item A is purchased as well.
Confidence = (transactions containing both A and B) / (transactions containing A)
Lift:
How the confidence of the rule compares with how often item B is purchased overall. A lift above 1 indicates that the items are bought together more often than expected by chance.
Lift = Confidence / ((transactions containing B) / (all transactions))
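The three measures can be computed directly from a list of transactions. The sketch below uses a made-up set of five grocery baskets and evaluates the rule "bread → milk"; the data and the function names are assumptions for illustration only.

```python
# Hypothetical grocery baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

def support(itemset, transactions):
    """Share of transactions that contain every item in itemset."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)

def confidence(a, b, transactions):
    """Share of transactions with A that also contain B."""
    return support(a | b, transactions) / support(a, transactions)

def lift(a, b, transactions):
    """Confidence of A -> B divided by the overall frequency of B."""
    return confidence(a, b, transactions) / support(b, transactions)

a, b = {"bread"}, {"milk"}
print(round(support(a | b, transactions), 3))
print(round(confidence(a, b, transactions), 3))
print(round(lift(a, b, transactions), 3))
```

In this toy data the lift comes out slightly below 1, meaning bread buyers are marginally less likely than the average customer to buy milk.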
Outlier Detection (Outlier Analysis)
• This is the process of identifying anomalies (outliers) in the data set.

• Outlier detection is valuable in numerous fields, such as network intrusion detection, credit or debit card fraud detection, and detecting outlying readings in wireless sensor network data.
Cont.
Example:

If your purchasers are almost exclusively male, but during one strange week in July,
there’s a huge spike in female purchasers, you’ll want to investigate the spike and see
what drove it, so you can either replicate it or better understand your audience in the
process.
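One simple way to flag a spike like the one described is a z-score test: a value counts as an outlier when it lies more than a chosen number of standard deviations from the mean. The weekly counts and the threshold of 2.0 below are assumptions chosen for illustration; other detection methods may suit real data better.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return the indices of values whose z-score exceeds the threshold."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [
        i for i, v in enumerate(values)
        if abs(v - mean) / spread > threshold
    ]

# Hypothetical weekly counts of female purchasers; week 6 is the spike.
weekly_female_purchasers = [12, 15, 11, 14, 13, 12, 90, 14, 13, 12]
outlier_weeks = find_outliers(weekly_female_purchasers)
print(outlier_weeks)
```

The flagged week is only a starting point: the analyst still has to investigate what drove the spike.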
Sequential Patterns
• This technique uses transaction data to discover or recognize similar trends, patterns, and events over a period of time.

Example:
Based on the sequential patterns found in purchasing histories, companies can offer better deals to clients who have an actual purchasing history.
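A very small sketch of the idea: count how many customers bought item A and later bought item B, and keep the ordered pairs that occur in a large enough share of purchase histories. The histories, the 0.5 support threshold, and the restriction to pairs are simplifying assumptions; full sequential pattern mining algorithms handle longer sequences and time windows.

```python
from collections import Counter
from itertools import combinations

def frequent_sequences(histories, min_support):
    """Find ordered pairs (a, b) where a was bought before b.

    Returns a dict mapping each frequent pair to its support, i.e. the
    share of purchase histories in which the pair occurs.
    """
    pair_counts = Counter()
    for history in histories:
        # combinations() keeps input order, so a always precedes b.
        # The set ensures each pair counts once per customer.
        seen = {(a, b) for a, b in combinations(history, 2) if a != b}
        pair_counts.update(seen)
    total = len(histories)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }

# Hypothetical purchase histories, one time-ordered list per customer.
histories = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["phone", "case"],
]
patterns = frequent_sequences(histories, min_support=0.5)
print(patterns)
```

A retailer could use a frequent pair such as "phone, then case" to time an offer on the second item shortly after the first is bought.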
Prediction
• This technique predicts the relationship that exists between independent and dependent variables, as well as among the independent variables alone.

• Prediction uses a combination of other data mining techniques, such as trend analysis, clustering, and classification.

• It can be used, for example, to predict future profit based on sales.


Cont.
Example:
Suppose that profit is the dependent variable and sales is the independent variable. Based on past sales data, we can then predict future profit using a regression curve.
Data Mining Tools
• The following data mining tools are widely used in organizations:
➢ R

➢ Python

➢ Oracle

➢ MS Excel

➢ Weka (Java), and many more.


Conclusion
• All of these techniques can help analyze different data from different perspectives.

• These techniques can be made to work together to tackle complex problems.

• Data mining techniques help companies obtain knowledge-based information.


References
• Han J & Kamber M, “Data Mining: Concepts and Techniques”, Third Edition, Elsevier, 2011.
• https://www.ibm.com/docs/en/db2/10.5?topic=SSEPGG_10.5.0/com.ibm.im.model.doc/c_lift_in_an_association_rule.html
• https://www.upgrad.com/blog/data-mining-techniques/
• https://www.javatpoint.com/data-mining-techniques
• https://www.datasciencecentral.com/profiles/blogs/the-7-most-important-data-mining-techniques
• https://onix-systems.com/blog/8-data-mining-techniques-you-must-learn-to-succeed-in-business
• https://www.infogix.com/top-5-data-mining-techniques/
