Data Mining Techniques
net/publication/370497135
1 author:
Nilu Singh
Koneru Lakshmaiah Education Foundation
All content following this page was uploaded by Nilu Singh on 04 May 2023.
• Conclusion
• References
Introduction
• Data mining is the process of examining large volumes of source data for underlying
and potentially useful patterns.
OR
Data mining is the process of looking at large banks of information to generate new information.
• To detect patterns that yield insights relevant to a specific purpose, organizations can
use many data mining techniques to turn raw data into actionable insights.
• These techniques/tools can incorporate statistical models, machine learning techniques, and
mathematical algorithms, such as neural networks or decision trees.
Cont.
The main purpose of data mining techniques is:
1. The creation of a predictive index that uses current information to predict
future values.
Note: Classification is used to build models that can assign the items in a data set to
different classes.
Cont.
• In classification analysis, you apply algorithms (e.g. Logistic Regression,
Naïve Bayes, Stochastic Gradient Descent, K-Nearest Neighbor, Decision Tree,
Random Forest, Support Vector Machine) to decide how new data should be
classified.
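As an illustration of how one of the listed algorithms decides the class of new data, here is a minimal K-Nearest Neighbor sketch in plain Python. The function name `knn_classify`, the sample points, and the labels "low" and "high" are assumptions made for this example, not part of the original slides.

```python
import math
from collections import Counter


def knn_classify(train, new_point, k=3):
    """Assign new_point the majority label among its k nearest training points."""
    # Sort the labelled examples by Euclidean distance to the new point.
    nearest = sorted(train, key=lambda item: math.dist(item[0], new_point))[:k]
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled data: (features, class label).
train = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"),
    ((7.0, 6.0), "high"),
    ((6.5, 7.5), "high"),
]

label_a = knn_classify(train, (1.2, 1.4))
label_b = knn_classify(train, (6.8, 6.9))
print(label_a, label_b)
```

The other algorithms in the list follow the same pattern: they are fitted on labelled data and then asked to label unseen items.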
➢ Classification of Data mining frameworks as per the type of data sources mined.
• This technique (clustering) helps to recognize the differences and similarities
between data items.
Example:
In a library, we can use clustering to keep similar books on one shelf and
then give each shelf a meaningful name, so that readers looking for books on a
particular topic can go straight to that shelf instead of searching the entire library.
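The library example can be sketched with k-means, one common clustering algorithm (the slides do not name a specific one). The two numeric features per book and the starting centroids below are hypothetical values chosen only for illustration.

```python
import math


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Put each point on the "shelf" whose centroid is nearest.
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters, centroids


# Hypothetical books described by (share of maths content, share of history content).
books = [
    (0.90, 0.10), (0.80, 0.20), (0.85, 0.15),
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.95),
]

# Assumption: start from one book of each kind as the initial centroids.
shelves, centers = k_means(books, [books[0], books[3]])
print(shelves)
```

Each resulting cluster plays the role of one shelf; naming the shelves remains a manual step.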
Regression
• Regression, valued for its predictive power, is used to analyze the relationships
between different variables.
• Regression analysis is also used to forecast the future value of a specific entity.
• The main aim of regression is to show the link between two pieces of
information in one data set.
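A minimal sketch of the idea, assuming simple linear regression fitted by ordinary least squares (the slides do not specify a regression model). The function name `fit_line` and the sample values are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that follow y = 2x + 1 exactly.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
# Use the fitted link between the two variables to forecast a future value.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

The slope describes how strongly the two variables are linked, and the fitted line is what allows a value to be forecast for an unseen input.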
Cont.
Example:
• In this technique (association rule mining), transactions and the relationships between
their items are used to identify patterns.
• It is used to conduct market basket analysis, which finds the products that
customers regularly buy together.
Cont.
Example:
Consider a list of the grocery items that you have been buying for the last six months. The
technique calculates the percentage of items that are purchased together. Below are the three
major measures:
Lift:
This measure compares the confidence with how often item B is purchased overall. A value
above 1 indicates that buying item A makes buying item B more likely.
(Confidence) / ((Item B) / (Entire dataset))
Support:
This measure shows how often items A and B are purchased together, relative to
the overall dataset.
(Transactions with both Item A and Item B) / (Entire dataset)
Confidence:
This measure shows how often item B is purchased when item A is
purchased as well.
(Transactions with both Item A and Item B) / (Transactions with Item A)
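The three measures can be computed directly from a list of transactions. The sketch below uses a small, hypothetical set of grocery baskets; the function names are chosen for this example only.

```python
def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, a, b):
    """How often b is bought in the transactions that contain a."""
    return support(transactions, {a, b}) / support(transactions, {a})


def lift(transactions, a, b):
    """Confidence of a -> b relative to how often b is bought overall."""
    return confidence(transactions, a, b) / support(transactions, {b})


# Hypothetical grocery baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

s = support(baskets, {"bread", "milk"})   # 3 of 5 baskets
c = confidence(baskets, "bread", "milk")  # 3 of the 4 bread baskets
l = lift(baskets, "bread", "milk")        # confidence / (4 of 5 milk baskets)
print(s, c, l)
```

In this sample the lift is below 1, so buying bread does not make buying milk more likely than it already is across all baskets.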
Outlier Detection (Outlier Analysis)
• This is the process of identifying anomalies (outliers) in the data set.
• This technique may be used in various domains such as intrusion detection, fraud
detection, etc.
If your purchasers are almost exclusively male, but during one strange week in July,
there’s a huge spike in female purchasers, you’ll want to investigate the spike and see
what drove it, so you can either replicate it or better understand your audience in the
process.
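One simple way to flag such a spike is a z-score test: mark any value that lies more than a chosen number of standard deviations from the mean. This is only one of several possible outlier tests; the weekly counts and the threshold of 3 below are assumptions for illustration.

```python
import statistics


def find_outliers(values, threshold=3.0):
    """Return (index, value) pairs lying more than `threshold` std devs from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        # All values are identical, so nothing stands out.
        return []
    return [(i, v) for i, v in enumerate(values) if abs(v - mean) / spread > threshold]


# Hypothetical weekly counts of female purchasers; index 6 is the strange week.
weekly_counts = [12, 15, 11, 14, 13, 12, 95, 14, 13, 12, 15, 11]

outliers = find_outliers(weekly_counts)
print(outliers)
```

A single extreme value inflates the standard deviation, so on very short series a rule based on medians or quartiles may be the safer choice.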
Sequential Patterns
• This technique aims to use transaction data, and then identify similar trends, patterns,
and events in it over a period of time.
Example:
Based on these patterns, companies can offer better deals to clients who have an actual
purchasing history.
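As a rough sketch of the idea, the code below counts how many customers bought one item and later bought another. The purchase histories and the function name `sequential_pair_counts` are hypothetical, and real sequential pattern mining algorithms handle longer sequences and time windows.

```python
from collections import Counter
from itertools import combinations


def sequential_pair_counts(histories):
    """Count, per ordered pair (a, b), the customers who bought a and later b."""
    counts = Counter()
    for history in histories:
        # combinations() keeps the original order, so (a, b) means a came before b.
        pairs = {(a, b) for a, b in combinations(history, 2) if a != b}
        # Using a set counts each pair at most once per customer.
        counts.update(pairs)
    return counts


# Hypothetical purchase histories, one list per customer, oldest purchase first.
histories = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["phone", "case"],
]

counts = sequential_pair_counts(histories)
print(counts[("phone", "case")], counts[("phone", "charger")])
```

Pairs with high counts suggest which follow-up product to offer to a client whose history already contains the first item.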
Prediction
• This technique predicts the relationship that exists between independent and
dependent variables as well as independent variables alone.
➢ Python
➢ Oracle
➢ MS-Excel