UNIT-3
General Approach to Classification
"How does classification work?" Data classification is a two-step process:
• Learning step: a classification model is constructed from training data.
• Classification step: the model is used to predict class labels for given data.
The process is illustrated for loan application data (figure: "The data classification process"):
• (a) Learning: Training data are analyzed by a classification algorithm. Here, the class label attribute is loan decision, and the learned model (the classifier) is represented in the form of classification rules.
• (b) Classification: Test data are used to estimate the accuracy of the classification rules. If the accuracy is considered acceptable, the rules can be applied to the classification of new data tuples (a minimal code sketch of these two steps follows this list).
• Supervised learning: the class label of each training tuple is provided.
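The sketch below illustrates the two steps, assuming scikit-learn is available; the loan tuples, feature encoding, and split are hypothetical placeholders, not data from these notes.

```python
# Hypothetical loan-application data: [income_in_thousands, good_credit] -> loan decision.
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X = [[40, 0], [75, 1], [28, 0], [90, 1], [55, 1], [33, 0], [68, 0], [82, 1]]
y = ["risky", "safe", "risky", "safe", "safe", "risky", "risky", "safe"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# (a) Learning step: construct a classifier from the training data.
model = DecisionTreeClassifier().fit(X_train, y_train)

# (b) Classification step: estimate accuracy on test data; if acceptable,
# apply the model to new tuples.
print("estimated accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("new applicant:", model.predict([[60, 1]]))
```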
• Unsupervised learning: the class label of each training tuple is not known, and the number or set of classes to be learned may not be known in advance. In this case we could use clustering to try to determine "groups of like tuples."

Decision Tree Induction
• A decision tree is a flowchart-like tree structure, where each internal node (nonleaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (or terminal node) holds a class label. The topmost node in a tree is the root node.
• A decision tree for the concept buys_computer indicates whether an AllElectronics customer is likely to purchase a computer. Each internal (nonleaf) node represents a test on an attribute. Each leaf node represents a class (either buys_computer = yes or buys_computer = no).

Attribute Selection Measures
• Information Gain: ID3 uses information gain as its attribute selection measure.
• The expected information needed to classify a tuple in D is given by
  Info(D) = - Σ_i p_i log2(p_i)
  where p_i is the probability that a tuple in D belongs to class C_i.
• How much more information would we still need (after partitioning D on an attribute A) to arrive at an exact classification? This amount is measured by
  Info_A(D) = Σ_j (|D_j| / |D|) × Info(D_j)
  where D_1, ..., D_v are the partitions of D induced by the values of A.
• Information gain is defined as the difference between the original information requirement (i.e., based on just the proportion of classes) and the new requirement (i.e., obtained after partitioning on A). That is,
  Gain(A) = Info(D) - Info_A(D)

Step-by-Step Decision Tree Induction on the "Buy Computer" Example
• The "Buy Computer" dataset is a small table of customer tuples, each described by attributes such as Age and labeled with the class "Yes" (buys a computer) or "No".

Step 1: Entropy of the Entire Dataset S
• Formula for Entropy:
  Entropy(S) = - Σ_i p_i log2(p_i)
• In the dataset there are 10 instances:
  – 5 instances are "Yes" (i.e., customers buy a computer).
  – 5 instances are "No" (i.e., customers do not buy a computer).
• The probability for each class is:
  p(Yes) = 5/10 = 0.5 and p(No) = 5/10 = 0.5
• Using these values in the entropy formula:
  Entropy(S) = -(0.5 × log2 0.5) - (0.5 × log2 0.5) = 0.5 + 0.5 = 1.0
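As a quick check of the Step 1 arithmetic, here is a small sketch; the helper name `entropy` is ours, not from the notes.

```python
import math

def entropy(counts):
    """Entropy of a class distribution, given as a list of class counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

# Step 1: the whole dataset S has 5 "Yes" and 5 "No" instances.
print(entropy([5, 5]))  # -> 1.0
```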
Step 2: Information Gain for Each Attribute
• Formula for Information Gain:
  Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) × Entropy(S_v)
  where S_v is the subset of S for which attribute A has value v.
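The same formula can be written as a small helper function. This is only a sketch: `information_gain` and its argument names are our own, and it reuses the `entropy` helper from the previous snippet.

```python
import math

def entropy(counts):
    # Same helper as in the previous sketch.
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, subset_counts):
    """Gain(S, A) = Entropy(S) - sum over v of (|S_v| / |S|) * Entropy(S_v)."""
    size_s = sum(parent_counts)
    remainder = sum(sum(sub) / size_s * entropy(sub) for sub in subset_counts)
    return entropy(parent_counts) - remainder

# Usage shape: information_gain(class_counts_of_S, [class_counts_of_each_subset, ...])
```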
Step 3: Information Gain for Attribute: Age
• The possible values for Age are "Young," "Middle-aged," and "Senior."
• We will:
  – Split the dataset into three subsets based on the values of Age.
  – Calculate the entropy for each subset.
  – Use the information gain formula to calculate the gain.
• Subset 1: Age = Young
  – 4 instances are "No".
  – 1 instance is "Yes".
  – The entropy of this subset is:
    Entropy(S_Young) = -(1/5) log2(1/5) - (4/5) log2(4/5) ≈ 0.722
• Subset 2: Age = Middle-aged
  – 2 instances are "Yes".
  – 0 instances are "No" (this subset is pure).
  – The entropy of this subset is:
    Entropy(S_Middle-aged) = 0
• Subset 3: Age = Senior
  – 3 instances are "Yes".
  – 1 instance is "No".
  – The entropy of this subset is:
    Entropy(S_Senior) = -(3/4) log2(3/4) - (1/4) log2(1/4) ≈ 0.811

Step 4: Calculate Weighted Average Entropy for Age
• Now we calculate the weighted average entropy for the attribute Age. The formula is:
  Entropy_Age(S) = Σ_v (|S_v| / |S|) × Entropy(S_v)
  = (|S_Young| / |S|) × 0.722 + (|S_Middle-aged| / |S|) × 0 + (|S_Senior| / |S|) × 0.811

Step 5: Information Gain for Age
• Finally, we calculate the information gain for the attribute Age:
  Gain(S, Age) = Entropy(S) - Entropy_Age(S)
  (the code sketch below carries out Steps 3-5 with the subset counts listed above).
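Putting Steps 3-5 together in code, as a sketch: the [Yes, No] counts per Age value are taken exactly as listed above, and the listed subset sizes are used as the weights in Step 4; the exact final numbers depend on the full data table, which is not reproduced in these notes.

```python
import math

def entropy(counts):
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

# Step 3: class counts [Yes, No] for each Age value, as listed above.
subsets = {"Young": [1, 4], "Middle-aged": [2, 0], "Senior": [3, 1]}
for value, counts in subsets.items():
    print(f"Entropy(Age={value}) = {entropy(counts):.3f}")
# -> Young 0.722, Middle-aged 0.000, Senior 0.811

# Step 4: weighted average entropy, each subset weighted by its share of the tuples.
total = sum(sum(c) for c in subsets.values())
weighted = sum(sum(c) / total * entropy(c) for c in subsets.values())
print(f"Weighted entropy for Age = {weighted:.3f}")

# Step 5: information gain = whole-dataset entropy (1.0 from Step 1) minus the Step 4 value.
print(f"Gain(S, Age) = {1.0 - weighted:.3f}")
```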