CHAPTER 3: DATA MINING FOR BUSINESS INTELLIGENCE
Definition of Data Mining

• Data mining is "the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases."

• Other names: knowledge extraction, pattern analysis, knowledge discovery, information harvesting, pattern searching, and data dredging.
Definition of Data Mining

• In this definition, the meanings of the key terms are as follows:

• Process implies that data mining comprises many iterative steps.

• Nontrivial means that some experimentation-type search or inference is involved; that is, it is not as straightforward as a computation of predefined quantities.

• Valid means that the discovered patterns should hold true on new data with a sufficient degree of certainty.
Definition of Data Mining

• Novel means that the patterns are not previously known to the user within the context of the system being analyzed.

• Potentially useful means that the discovered patterns should lead to some benefit to the user or task.

• Ultimately understandable means that the pattern should make business sense that leads to the user saying, "Mmm! It makes sense; why didn't I think of that?"
Definition of Data Mining

■ Data mining is not a new discipline, but rather a new use of many existing disciplines.

■ Data mining is tightly positioned at the intersection of many disciplines, including statistics, artificial intelligence, machine learning, management science, information systems, and databases (see next figure).
How Data Mining Works
■ Using existing and relevant data, data mining builds models to identify patterns among
the attributes presented in the data set.
■ Models are the mathematical representations that identify the patterns among the
attributes of the objects (e.g., customers) described in the data set.
■ Some of these patterns are explanatory (explaining the interrelationships and affinities
among the attributes), whereas others are predictive (foretelling future values of certain
attributes).

■ Types of patterns
1. Association
2. Prediction
3. Cluster (segmentation)
4. Sequential (or time series) relationships
Data Mining Applications

■ Customer Relationship Management

– Maximize return on marketing campaigns

– Improve customer retention (churn analysis)

– Maximize customer value (cross-selling, upselling)

– Identify and treat most valued customers


Data Mining Applications
■ Banking
– Automating the loan application process by predicting the most probable defaulters

– Detecting fraudulent credit card and online-banking transactions

– Maximizing customer value (cross-selling, upselling)

– Optimizing the cash return by forecasting the cash flow on banking entities (e.g., ATMs, banking branches)
Data Mining Applications
■ Manufacturing and Production:

– Predict/prevent machinery failures

– Identify anomalies in production systems to optimize the use of manufacturing capacity

– Discover novel patterns to improve product quality

Data Mining Applications
■ Medicine
(1) identify novel patterns to improve the survivability of patients with cancer;

(2) predict success rates of organ transplantation patients to develop better donor-organ matching policies;

(3) identify the functions of different genes in the human chromosome (known as genomics);

(4) discover the relationships between symptoms and illnesses to make informed and correct decisions in a timely manner.
Why Data Mining?
■ More intense competition at the global scale driven by customers' ever-changing
needs and wants

■ Recognition of the value in data sources

■ Availability of quality data on customers, vendors, transactions, Web, etc.

■ Consolidation of databases and other data repositories into a single location in the
form of a data warehouse.

■ The exponential increase in data processing and storage capabilities.

■ Significant reduction in the cost of hardware and software for data storage and
processing.
Data Mining Applications
■ Entertainment industry
– analyze viewer data to decide what programs to show during prime
time and how to maximize returns by knowing where to insert
advertisements;
– predict the financial success of movies before they are produced to
make investment decisions and to optimize the returns;
– forecast the demand at different locations and different times to
better schedule entertainment events and to optimally allocate
resources
Data Mining Applications
■ Sports.
■ Healthcare.
■ Insurance.
■ Travel industry
■ Government and defense.
■ Brokerage and securities trading.
■ Retailing and logistics.
Characteristics and Objectives of DM
The following are the major characteristics and objectives
of data mining:
▪ Source of data for DM is often (but not always) a consolidated
data warehouse
▪ DM environment is usually a client-server or a Web-based
information systems architecture
▪ Data is the most critical ingredient for DM, which may include
soft/unstructured data.
Characteristics and Objectives of DM

▪ The miner is often an end user

▪ Striking it rich requires creative thinking.

▪ Data mining tools are readily combined with other software
development tools. Thus, the mined data can be analyzed
and deployed quickly and easily.
DATA MINING PROCESS
Most common standard processes:
• CRISP-DM (Cross-Industry Standard Process for Data
Mining)
• SEMMA (Sample, Explore, Modify, Model, and Assess)
• KDD (Knowledge Discovery in Databases)
CRISP-DM
■ Step 1: Business Understanding:
• The key element of any data mining study is to know what the
study is for.
• Then a project plan for finding such knowledge is developed
that specifies the people responsible for collecting the data,
analyzing the data, and reporting the findings.
• At this early stage, a budget to support the study should also be
established.
Step 2: Data Understanding:
■ First, the analyst should be clear and concise about the description of the data
mining task so that the most relevant data can be identified.

■ Furthermore, the analyst should understand:

– the data sources (e.g., where the relevant data are stored and in what form;
whether the process of collecting the data is automated or manual; who
the collectors of the data are and how often the data are updated)

– the variables (e.g., What are the most relevant variables? Are the variables
independent of each other, i.e., do they stand as a complete information source
without overlapping or conflicting information?)
Step 2: Data Understanding:
■ Data sources for data selection can vary. Normally, data sources for
business applications include:
– demographic data (such as income, education, number of
households, and age),
– sociographic data (such as hobby, club membership, and
entertainment),
– transactional data (sales records, credit card spending, issued
checks), and so on.
Step 2: Data Understanding:
■ Data can be categorized as quantitative and qualitative.

■ Quantitative data is measured using numeric values. It can be discrete (such as integers) or continuous (such as real numbers).

■ Qualitative data, also known as categorical data, contains both nominal and ordinal data.

– Nominal data has finite nonordered values (e.g., gender data, which has two values: male and female).

– Ordinal data has finite ordered values. For example, customer credit ratings are considered ordinal data because the ratings can be excellent, fair, and bad.
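To make the distinction concrete, the following minimal Python/pandas sketch (with made-up values) encodes one nominal and one ordinal variable; only the ordinal one carries an ordering that supports comparisons:

import pandas as pd

# Nominal: finite, unordered categories (no ranking implied)
gender = pd.Series(["male", "female", "female"], dtype="category")

# Ordinal: finite, ordered categories (bad < fair < excellent)
rating_type = pd.CategoricalDtype(categories=["bad", "fair", "excellent"], ordered=True)
credit_rating = pd.Series(["excellent", "fair", "bad"], dtype=rating_type)

print(gender.cat.ordered)         # False -- nominal
print(credit_rating.cat.ordered)  # True  -- ordinal
print(credit_rating.min())        # 'bad', because the ordering enables comparisons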
Step 3: Data Preparation / Data Preprocessing
- The goal is to take the data identified in the previous step and
prepare it for analysis by data mining methods.
- Compared to the other steps, data preprocessing consumes the most
time and effort. (Why?)
- Real-world data is typically:
- incomplete (lacking attribute values, lacking certain attributes of
interest, or containing only aggregate data)
- noisy (containing errors or outliers)
- inconsistent (containing discrepancies in codes or names)
Step 3: Data Preparation Stages
Step 3: Data Preparation
■ Data preparation involves four main steps:
1. Data Consolidation

■ In the first phase of data preprocessing, the relevant data is collected from the identified sources (accomplished in the previous step),

■ the necessary records and variables are selected (based on an intimate understanding of the data, the unnecessary sections are filtered out), and the records coming from multiple data sources are integrated.
Step 3: Data Preparation

2. Data Cleaning (data scrubbing):

■ In the second phase of data preprocessing, the data is cleaned.
■ In this step, the missing values in the data set are identified and dealt with.
■ In some cases, missing values are an anomaly in the data set, in which case they need to be imputed (filled with a most probable value) or ignored;
■ in other cases, the missing values are a natural part of the data set (e.g., the household income field is often left unanswered by people who are in the top income tier).
Step 3: Data Preparation
2. Data Cleaning:
■ In this step, the analyst should also identify noisy values in the data (i.e., the outliers) and smooth them out.

■ Additionally, inconsistencies (unusual values within a variable) in the data should be handled using domain knowledge and/or expert opinion.
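As an illustration only, the short pandas sketch below imputes a missing value and caps an outlier with a simple interquartile-range rule; the data frame, column names, and thresholds are hypothetical:

import numpy as np
import pandas as pd

# Made-up customer records: one missing age, one suspiciously large income
df = pd.DataFrame({
    "age": [34.0, 41.0, np.nan, 29.0, 38.0],
    "income": [52_000.0, 61_000.0, 58_000.0, 47_000.0, 1_200_000.0],
})

# Impute the missing value with a most probable value (here: the median)
df["age"] = df["age"].fillna(df["age"].median())

# Smooth out noisy values: cap anything beyond 1.5 * IQR above the third quartile
q1, q3 = df["income"].quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)
df.loc[df["income"] > upper_fence, "income"] = upper_fence

print(df)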
Step 3: Data Preparation
3. Data Transformation:
■ In the third phase, the data is transformed for better processing.

■ In many cases the data is normalized between a certain minimum and maximum for all variables in order to mitigate the potential bias of one variable (having large numeric values, such as for household income) dominating other variables having smaller values.
Step 3: Data Preparation
3. Data Transformation
■ Another transformation is discretization and/or aggregation. In some cases, the numeric variables are converted to categorical values (e.g., low, medium, high);

■ in other cases one might choose to create new variables based on the existing ones to magnify the information found in the variables in the data set.
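A minimal Python sketch of both transformations, using a small made-up data frame, might look like this (min-max normalization followed by discretization into low/medium/high bands):

import pandas as pd

df = pd.DataFrame({"household_income": [32_000, 54_000, 71_000, 120_000],
                   "age": [23, 35, 47, 62]})

# Min-max normalization: rescale every variable to [0, 1] so that a large-valued
# variable (income) does not dominate a smaller-valued one (age)
normalized = (df - df.min()) / (df.max() - df.min())

# Discretization: convert a numeric variable into categorical bands
df["income_band"] = pd.cut(df["household_income"], bins=3,
                           labels=["low", "medium", "high"])

print(normalized)
print(df)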
Step 3: Data Preparation
4. Data Reduction:
■ Even though data miners like to have large data sets, too much data is also a problem.

■ In the simplest sense, data can be visualized in data mining projects as a flat file consisting of two dimensions:

– variables (the number of columns)

– cases/records (the number of rows)

Step 3: Data Preparation
4. Data Reduction:
■ In some cases, the number of variables can be rather large, and the analyst must reduce the number to a manageable size.

■ Some data sets may include millions of records. Even though computing power is increasing exponentially, processing that many records may not be practical or feasible. In such cases, one may need to sample a subset of the data for analysis.

■ The analyst should be careful in selecting a subset of the data that reflects the essence of the complete data set and is not specific to a subgroup or subcategory.
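For illustration, the short pandas sketch below draws a 10 percent sample; the file name transactions.csv and the churn column are hypothetical stand-ins for a real data set, and the stratified version preserves subgroup proportions so the sample reflects the whole:

import pandas as pd

df = pd.read_csv("transactions.csv")  # assumed source file

# Simple random sample of 10% of the records
sample = df.sample(frac=0.10, random_state=42)

# Stratified sample: keep the proportion of each class (e.g., churners vs. non-churners)
stratified = df.groupby("churn").sample(frac=0.10, random_state=42)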
Step 4: Model Building
■ In this step, various modeling techniques are selected and applied to a prepared data set in order to address the specific business need.

■ The model-building step also encompasses the assessment and comparative analysis of the various models built. (Why?)

■ Some methods may have specific requirements on the way that the data is to be formatted; thus, stepping back to the data preparation step is often necessary.

■ Depending on the business need, the data mining task can be of a classification, association, or clustering type.
Step 5: Testing and Evaluation
■ In step 5, the developed models are assessed and evaluated for their accuracy.

■ This step assesses the degree to which the selected model (or models) meets the
business objectives and, if so, to what extent.

■ Another option is to test the developed model(s) in a real-world scenario if time and
budget constraints permit.

■ This step is a critical and challenging task. No value is added by the data mining task
until the business value obtained from discovered patterns is identified and
recognized.
Step 5: Testing and Evaluation
■ The success of this step depends on the interaction among data analysts,
business analysts, and decision makers (such as business managers).

■ Because data analysts may not have a full understanding of the data mining
objectives and what they mean to the business, and the business analysts and
decision makers may not have the technical knowledge to interpret the results
of sophisticated mathematical solutions, interaction among them is necessary.
Step 6: Deployment
■ Depending on the requirements, the deployment phase can be as simple as
generating a report or as complex as implementing a repeatable data mining
process across the enterprise.

■ The deployment step may also include maintenance activities for the deployed
models.

■ Over time, the models (and the patterns embedded within them) built on the
old data may become obsolete, irrelevant, or misleading.
SEMMA
■ SEMMA = "sample, explore, modify, model, and assess."
■ Sample = begin with a statistically representative sample of the data.
■ Explore = apply exploratory statistical and visualization techniques.
■ Modify = select and transform the most significant predictive variables.
■ Model = model the variables to predict outcomes.
■ Assess = confirm a model's accuracy.
CRISP-DM VS SEMMA

■ CRISP-DM takes a more comprehensive approach to data mining projects, including understanding of the business and the relevant data, whereas SEMMA assumes that the data mining project's goals and data sources have already been identified and understood.
KDD Process
■ Knowledge discovery in databases (KDD) is a process of using data mining
methods to find useful information and patterns in the data.
■ KDD is a comprehensive process that encompasses data mining.
■ The input to the KDD process is organizational data. The enterprise data
warehouse (EDW) enables KDD to be implemented because it provides a
single source for data to be mined.
■ Dunham (2003) summarized the KDD process as consisting of the following
steps:
– data selection
– data preprocessing
– data transformation
– data mining
– interpretation/evaluation
KDD Process
Data Mining Methods
■ Classification
■ The most frequently used data mining method for real-world problems.

■ A member of the machine-learning family of techniques.

■ Learns from past data to classify new data.

– E.g., using classification to predict whether the weather on a particular day will be
"sunny," "rainy," or "cloudy."

– Popular classification tasks include credit approval (i.e., good or bad credit risk)

– and store location (e.g., good, moderate, bad).

GENERAL APPROACH TO BUILD A CLASSIFICATION MODEL
Data Mining Methods
■ Classification
■ The most common two-step methodology of classification-type
prediction involves:
– model training and,
– model testing.
■ In the model development phase, a collection of input data, including
the actual class labels, is used.
■ After a model has been trained, the model is tested against the
holdout sample for accuracy assessment,
■ and eventually deployed for actual use, where it predicts the classes of
new data instances (whose class labels are unknown).
Estimating the True Accuracy of Classification Models

Confusion Matrix (or Classification Matrix)

■ A confusion matrix is a table that is often used to describe the performance of a
classification model (or "classifier") on a set of test data for which the true
values are known.
■ The next figure represents a confusion matrix for a binary classifier.
■ Binary: the class takes only two values, i.e., true or false.
■ When the classification problem is not binary, the confusion matrix gets bigger.
Confusion Matrix for a binary classifier
Confusion Matrix
■ true positives (TP): These are cases in which we predicted yes (the products are
useful), and actually they are useful.
■ true negatives (TN): We predicted no, and they are not useful.
■ false positives (FP): We predicted yes, but they aren't actually useful.
■ false negatives (FN): We predicted no, but they are actually useful.
■ This is a list of rates that are often computed from a confusion matrix for a binary
classifier:
■ Accuracy: Overall, how often is the classifier correct?
– (TP+TN)/total
■ Misclassification Rate: Overall, how often is it wrong?
– (FP+FN)/total
– equivalent to 1 minus Accuracy
Confusion Matrix
■ True Positive Rate: When it's actually yes, how often does it predict yes?
– TP/actual yes
– also known as "Recall"
■ False Positive Rate: When it's actually no, how often does it predict yes?
– FP/actual no
■ True Negative Rate: When it's actually no, how often does it predict no?
– TN/actual no
– equivalent to 1 minus False Positive Rate
■ Precision: When it predicts yes, how often is it correct?
– TP/predicted yes
An Example of Confusion Matrix
■ Accuracy:
– (TP+TN)/total = (100+50)/165 = 0.91
■ Misclassification Rate:
– (FP+FN)/total = (10+5)/165 = 0.09
■ True Positive Rate:
– TP/actual yes = 100/105 = 0.95
■ False Positive Rate:
– FP/actual no = 10/60 = 0.17
■ True Negative Rate:
– TN/actual no = 50/60 = 0.83
■ Precision:
– TP/predicted yes = 100/110 = 0.91
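The figures above can be reproduced with a few lines of Python, using the counts from the worked example (TP = 100, TN = 50, FP = 10, FN = 5):

TP, TN, FP, FN = 100, 50, 10, 5
total = TP + TN + FP + FN                # 165 test cases

accuracy            = (TP + TN) / total  # 0.91
misclassification   = (FP + FN) / total  # 0.09
true_positive_rate  = TP / (TP + FN)     # recall, 0.95
false_positive_rate = FP / (FP + TN)     # 0.17
true_negative_rate  = TN / (FP + TN)     # 0.83
precision           = TP / (TP + FP)     # 0.91

print(round(accuracy, 2), round(true_positive_rate, 2), round(precision, 2))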
Estimation Methodologies for Classification
■ Simple split (or test sample estimation):
■ Dividing the data into two mutually exclusive subsets called:
– a training set and
– a test set (or holdout set).
■ It is common to designate two-thirds of the data as the training set and the remaining one-third
as the test set.
■ The training set is used by the inducer (model builder), and the built classifier is then tested on
the test set.
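A minimal scikit-learn sketch of the simple split is shown below; the built-in Iris data set stands in for a prepared business data set, and a decision tree serves as the inducer:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)  # stand-in for a prepared data set

# Two mutually exclusive subsets: two-thirds for training, one-third held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=42)

model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)  # the inducer
print("Holdout accuracy:", accuracy_score(y_test, model.predict(X_test)))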
Estimation Methodologies for Classification
■ k-Fold Cross-Validation
■ A given data set is split into k sections/folds, where each fold is used as a testing
set at some point.
■ For example, consider 5-fold cross-validation (k = 5).
■ Here, the data set is split into 5 folds.
■ In the first iteration, the first fold is used to test the model and the rest are used
to train the model.
■ In the second iteration, the 2nd fold is used as the testing set while the rest serve as
the training set.
■ This process is repeated until each of the 5 folds has been used as the
testing set.
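A short scikit-learn sketch of 5-fold cross-validation, again using a built-in data set as a stand-in, might look like this:

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Split the data into 5 folds; each fold serves once as the testing set
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=kfold)

print("Per-fold accuracy:", scores.round(2))
print("Mean accuracy:", round(scores.mean(), 2))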
K-Fold Cross Validation
Classification Techniques: Decision Tree
■ A decision tree takes the form of a tree structure. It breaks down a data set into smaller and
smaller subsets while at the same time an associated decision tree is gradually
developed.

■ A tree can be "learned" by splitting the source set into subsets based on an attribute
value test (inputs). This process is repeated on each derived subset in a recursive
manner called "recursive partitioning."

■ The basic idea is to ask questions whose answers would provide the most
information.

■ Decision trees are one of the most popular machine learning algorithms, and they can
handle both categorical and numerical data.
Decision Tree
The tree has three types of nodes:

■ A root node that has no incoming edges and zero or more outgoing edges.

■ Internal nodes, each of which has exactly one incoming edge and two or more outgoing edges.

■ Leaf or terminal nodes, each of which has exactly one incoming edge and no outgoing edges.
A decision tree for the mammal classification problem
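As a rough illustration (the tiny animal table below is made up and deliberately simplified), scikit-learn can learn such a tree by recursive partitioning and print its root, internal, and leaf nodes:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: warm-blooded, live-bearing animals are labeled mammals; the last row
# is a cold-blooded live-bearer (e.g., some snakes), so a single split is not enough
data = pd.DataFrame({
    "gives_birth":  [1, 0, 1, 0, 1, 1],
    "warm_blooded": [1, 1, 1, 0, 1, 0],
    "mammal":       [1, 0, 1, 0, 1, 0],
})

tree = DecisionTreeClassifier(random_state=42).fit(
    data[["gives_birth", "warm_blooded"]], data["mammal"])

# The printed rules show the root question, an internal node, and the leaves
print(export_text(tree, feature_names=["gives_birth", "warm_blooded"]))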
Cluster Analysis for Data Mining

■ Cluster analysis or clustering is the process of partitioning a set of data objects (or observations) into subsets.

■ Each subset is a cluster, such that objects in a cluster are similar to one another, yet dissimilar to objects in other clusters.

■ Different clustering methods may generate different clusterings on the same data set.

■ The partitioning is not performed by humans, but by the clustering algorithm.

Cluster Analysis for Data Mining
■ The method is commonly used in biology, medicine, genetics, social network
analysis, anthropology, archaeology.

■ Cluster analysis has been used extensively for fraud detection and market
segmentation of customers in CRM systems.

■ Clustering has also found many applications in Web search.

■ Clustering can also be used for outlier detection, where outliers (values that are
“far away” from any cluster) may be more interesting than common cases.
Cluster Analysis for Data Mining
■ Applications of outlier detection include the detection of credit card fraud and
the monitoring of criminal activities in e-commerce.

– E.g., unusual cases in credit card transactions, such as very expensive and
infrequent purchases, may be of interest as possible fraudulent activities.

■ Clustering is known as unsupervised learning because the class label information is not present. For this reason, clustering is a form of learning by observation, rather than learning by examples.

■ Most cluster analysis methods involve the use of a distance measure to calculate the closeness between pairs of items.
ANALYSIS METHODS
■ Cluster analysis may be based on one or more of the following general
methods:
– Statistical methods, such as k-means, k-modes, and so on
– Neural networks
– Fuzzy logic
– Genetic algorithms
■ Each of these methods generally works with one of two general approaches:
– Divisive: With the divisive approach, all items start in one cluster and are
broken apart.
– Agglomerative: With the agglomerative approach, all items start in individual
clusters, and the clusters are joined together.
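As an illustrative sketch only, the following Python code applies k-means (one of the statistical methods listed above) to synthetic data standing in for customer attributes:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler

# Synthetic two-attribute "customer" data in place of a real CRM data set
X, _ = make_blobs(n_samples=200, centers=3, n_features=2, random_state=42)
X = MinMaxScaler().fit_transform(X)   # distance measures work best on normalized data

# k-means partitions the observations into k clusters around learned centers
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

print("Cluster sizes:", [int((kmeans.labels_ == k).sum()) for k in range(3)])
print("Cluster centers:")
print(kmeans.cluster_centers_)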
Classification Vs. Clustering
■ Classification uses predefined classes to which objects are assigned, while
clustering identifies similarities between objects, which it groups according to the
characteristics they have in common and that differentiate them from other groups of objects.

■ Clustering is framed in unsupervised learning; that is, for this type of algorithm we only
have a set of input data (not labelled), about which we must obtain information, without
previously knowing what the output will be.

■ On the other hand, classification belongs to supervised learning, which means that we
know the input data (labelled in this case) and we know the possible output of the
algorithm.
Association Rule Mining
■ Association rule mining (also known as affinity analysis or market-basket
analysis) is a popular data mining method.
■ It is part of the machine-learning family of techniques.
■ It aims to find interesting relationships (affinities) between variables (items)
in large databases.
■ The input to market-basket analysis is simple point-of-sale transaction data,
where the products and/or services purchased together are tabulated under a
single transaction instance.
■ The outcome of the analysis is invaluable information.
Association Rule Mining

A business can take advantage of such knowledge by:

(1) putting the items next to each other to make it more convenient for the
customers to pick them up together and not forget to buy one when buying the
others (increasing sales volume);

(2) promoting the items as a package (do not put one on sale if the other(s) are on
sale)

(3) placing them apart from each other so that the customer has to walk the aisles
to search for them, and by doing so potentially sees and buys other items.
Association Rule Mining
Are all association rules interesting and useful?
■ A generic rule: X → Y [S%, C%]
■ X, Y: products and/or services
■ X: left-hand side (LHS), or antecedent
■ Y: right-hand side (RHS), or consequent
■ S: support: how often X and Y go together
■ C: confidence: how often Y goes together with X
■ Example: {Laptop Computer, Antivirus Software} → {Extended Service
Plan} [30%, 70%]
Association Rule Mining
■ Support refers to the percentage of baskets where the rule was true (both the left-
and right-side products were present).

■ Confidence measures how often the products on the RHS appear in baskets that
also contain the products on the LHS.

■ Lift measures how much more often the LHS and RHS occur together than would
be expected if they were independent (i.e., the ratio of confidence to the overall
frequency of the RHS).
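These three measures can be computed directly from transaction data. The following Python sketch uses a handful of made-up baskets and a hypothetical rule {laptop, antivirus} → {service plan}:

# Hypothetical market baskets (each set is one point-of-sale transaction)
baskets = [
    {"laptop", "antivirus", "service plan"},
    {"laptop", "antivirus"},
    {"laptop", "service plan"},
    {"antivirus"},
    {"laptop", "antivirus", "service plan"},
]

X = {"laptop", "antivirus"}   # antecedent (LHS)
Y = {"service plan"}          # consequent (RHS)

n = len(baskets)
n_x  = sum(1 for b in baskets if X <= b)          # baskets containing X
n_y  = sum(1 for b in baskets if Y <= b)          # baskets containing Y
n_xy = sum(1 for b in baskets if (X | Y) <= b)    # baskets containing both

support    = n_xy / n                 # how often X and Y go together
confidence = n_xy / n_x               # how often Y appears given X
lift       = confidence / (n_y / n)   # > 1: co-occur more often than expected

print(f"support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")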
Apriori Algorithm

• Finds subsets that are common to at least a minimum number of the itemsets
• Uses a bottom-up approach:
– frequent subsets are extended one item at a time (the size of frequent subsets increases from one-item subsets to two-item subsets, then three-item subsets, and so on), and
– groups of candidates at each level are tested against the data for minimum support
– see the figure…
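A compact, illustrative Python implementation of this bottom-up search (a simplified sketch rather than an optimized Apriori, run on made-up baskets) might look like this:

from itertools import combinations

def frequent_itemsets(baskets, min_support=0.4):
    # Extend frequent itemsets one item at a time, keeping only candidates
    # whose support meets the minimum threshold (bottom-up search)
    n = len(baskets)
    level = [frozenset([item]) for item in sorted({i for b in baskets for i in b})]
    frequent = {}
    while level:
        # Test this level's candidates against the data for minimum support
        supports = {c: sum(1 for b in baskets if c <= b) / n for c in level}
        survivors = {c: s for c, s in supports.items() if s >= min_support}
        frequent.update(survivors)
        # Build the next level by joining surviving itemsets that differ by one item
        level = list({a | b for a, b in combinations(survivors, 2)
                      if len(a | b) == len(a) + 1})
    return frequent

baskets = [{"milk", "bread"}, {"milk", "bread", "eggs"},
           {"bread", "eggs"}, {"milk", "eggs"}, {"milk", "bread", "eggs"}]
for itemset, support in frequent_itemsets(baskets).items():
    print(sorted(itemset), round(support, 2))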
Apriori Algorithm
Data Mining Myths
■ Myth: Data mining provides instant, crystal-ball-like predictions.
  Reality: Data mining is a multistep process that requires thoughtful, proactive design and use.

■ Myth: Data mining is not yet viable for business applications.
  Reality: The current state of the art is ready to go for almost any business.

■ Myth: Data mining requires a separate, devoted database.
  Reality: Because of advances in database technology, a devoted database is not required, even though it may be desirable.

■ Myth: Only those with advanced degrees can do data mining.
  Reality: Newer Web-based tools enable managers of all educational levels to do data mining.

■ Myth: Data mining is only for large firms that have lots of customer data.
  Reality: If the data accurately reflect the business or its customers, a company can use data mining.
Common Data Mining Mistakes
■ Selecting the wrong problem for data mining

■ Leaving insufficient time for data preparation

■ Looking only at aggregated results and not at individual records

■ Being sloppy about keeping track of the data mining procedure and results

■ Ignoring suspicious (good or bad) findings and quickly moving on

■ Running mining algorithms repeatedly and blindly, without thinking about the
next stage
