ID3 Decision Tree Explanation
The ID3 (Iterative Dichotomiser 3) algorithm is a popular decision tree algorithm used for
classification tasks. It was developed by Ross Quinlan in 1986 and is one of the simplest
algorithms for creating decision trees.
Key Concepts:
1. Decision Tree: A decision tree is a flowchart-like structure where each internal node
represents a decision based on the value of a feature (attribute), and each leaf node
represents the classification outcome (or decision).
2. ID3: The ID3 algorithm is used to build a decision tree by selecting the best feature to
split the data at each step, aiming to reduce uncertainty in the data. It uses information
gain as a criterion to determine which attribute to split on at each node.
3. Entropy and Information Gain: The algorithm begins by considering all the data at the
root node. Information Gain (IG) measures how much uncertainty is reduced when we
split the data on an attribute. It is based on Entropy, a measure of disorder or
impurity in the dataset.
Entropy is defined as:

H(D) = -\sum_{i=1}^{k} p_i \log_2(p_i)

Where:
p_i is the proportion of instances in D that belong to class i, and k is the number of classes.

The Information Gain of splitting D on an attribute A is:

IG(D, A) = H(D) - \sum_{v=1}^{V} \frac{|D_v|}{|D|} H(D_v)

Where:
A is the attribute.
D_v is the subset of D where attribute A has value v, and the sum runs over the V distinct values of A.
4. Splitting the Data: The attribute with the highest Information Gain is chosen for
splitting. The dataset is split into subsets, one for each unique value of the chosen
attribute. For each subset, the algorithm recursively repeats the process: calculate
entropy, determine the best attribute, and split the data further (a short code sketch
after this list illustrates these steps).
5. Stopping Criteria:
All instances in a subset belong to the same class (i.e., no further splitting is
needed).
There are no more attributes to split on (all attributes have been used).
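To make these steps concrete, here is a minimal sketch of ID3 in Python. It assumes the dataset is a list of dicts and the target column is passed by name; the helper names entropy, information_gain, and id3 are purely illustrative.

```python
import math
from collections import Counter

def entropy(rows, target):
    """H(D) = -sum(p_i * log2(p_i)) over the class proportions in rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attribute, target):
    """IG(D, A): dataset entropy minus the weighted entropy of the subsets created by A."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset, target)
    return entropy(rows, target) - remainder

def id3(rows, attributes, target):
    """Return a leaf label, or a nested dict {attribute: {value: subtree}}."""
    classes = [row[target] for row in rows]
    if len(set(classes)) == 1:      # stopping criterion: all instances share one class
        return classes[0]
    if not attributes:              # stopping criterion: no attributes left -> majority class
        return Counter(classes).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, [a for a in attributes if a != best], target)
    return tree
```

On a dataset stored this way, a call such as id3(rows, ["Weather", "Temperature"], "PlayTennis") returns either a class label or a nested dict whose top-level key is the chosen root attribute.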
Example (Simple):
Let's say we have a dataset with two attributes, Weather and Temperature, and a target
label PlayTennis (whether the person plays tennis or not). We want to predict whether
someone will play tennis based on weather conditions and temperature.
Dataset:
Weather   Temperature   PlayTennis
Sunny     Hot           No
Rainy     Cool          No
Step-by-Step Process:
1. Calculate the Entropy of the Entire Dataset: The dataset contains 4 "Yes" and 2 "No"
instances, so:

H(PlayTennis) = -\frac{4}{6}\log_2\left(\frac{4}{6}\right) - \frac{2}{6}\log_2\left(\frac{2}{6}\right) \approx 0.918
2. Calculate Information Gain for Each Attribute: For each attribute (Weather,
Temperature), calculate how much uncertainty is reduced if we split the data based on
that attribute. For example, for the Weather attribute:
If the weather is Rainy, the outcomes are mixed, so we calculate the entropy of each
subset and compute the weighted average across all Weather values.
Calculate the information gain for Weather and Temperature, and choose the attribute
with the highest information gain (see the short code sketch after this list).
3. Create the Root Node and Branches: If, say, the attribute Weather has the highest
information gain, we make Weather the root of the tree. We create branches for Sunny,
Overcast, and Rainy.
4. Repeat the Process for Each Branch: For each branch, the process repeats to decide
whether to split further, based on the best attribute at that node.
5. Construct the Decision Tree: Continue recursively until the tree is fully constructed.
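As a small illustration of steps 1 to 3 above, once the entropies are known the choice of root is just an argmax over information gain. The numbers below are the ones worked out in the detailed example later in this explanation, and the variable names are purely illustrative.

```python
# Values taken from the worked example below.
h_dataset = 0.918                                      # entropy of the full dataset
weighted = {"Weather": 0.667, "Temperature": 0.667}    # weighted entropy after each split

# Information gain for each candidate attribute: IG = H(D) - weighted entropy.
gains = {attr: h_dataset - h for attr, h in weighted.items()}

# ID3 picks the attribute with the highest gain; on a tie, max() keeps the first
# attribute it encounters, so Weather becomes the root here.
root = max(gains, key=gains.get)
print(root, gains)   # Weather, with a gain of about 0.25 for both attributes
```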
Disadvantages:
Prone to overfitting, especially if the tree is too deep.
Tends to favor attributes with more categories, which can lead to biased splits.
Conclusion:
The ID3 algorithm is a foundational decision tree algorithm that uses entropy and
information gain to create simple, interpretable classification models. Although it works well
for small datasets, it may need enhancements (like pruning) to handle larger, more complex
data effectively.
Let's go through a detailed example of the ID3 algorithm using a simple dataset and
visualize it with a decision tree diagram.
Example Dataset:
We are trying to predict whether a person will play tennis (PlayTennis) based on two
features: Weather and Temperature.

Weather   Temperature   PlayTennis
Sunny     Hot           No
Rainy     Cool          No
The entropy of a dataset D is calculated as:

H(D) = -\sum_{i=1}^{k} p_i \log_2(p_i)

For our dataset, which has 4 "Yes" and 2 "No" instances:

H(PlayTennis) = -\left(\frac{4}{6}\log_2\left(\frac{4}{6}\right) + \frac{2}{6}\log_2\left(\frac{2}{6}\right)\right) \approx 0.918
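For readers who want to verify the arithmetic, here is a quick check in Python (the variable name is just illustrative):

```python
import math

# H(PlayTennis) for 4 "Yes" and 2 "No" out of 6 instances.
h_playtennis = -((4 / 6) * math.log2(4 / 6) + (2 / 6) * math.log2(2 / 6))
print(round(h_playtennis, 3))  # 0.918
```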
Next, we compute the weighted entropy of each attribute split.
For Weather:
The possible values for Weather are Sunny, Overcast, and Rainy.
We will calculate the entropy for each subset of data created by these values.
Subset for Sunny (1 "Yes", 1 "No"):

H(Sunny) = -\left(\frac{1}{2}\log_2\left(\frac{1}{2}\right) + \frac{1}{2}\log_2\left(\frac{1}{2}\right)\right) = 1

Subset for Overcast:
Outcome: 2 "Yes", so the subset is pure and H(Overcast) = 0.
Subset for Rainy (1 "Yes", 1 "No"):
H(Rainy) = 1
The weighted entropy after splitting on Weather is:

H(Weather) = \frac{2}{6} H(Sunny) + \frac{2}{6} H(Overcast) + \frac{2}{6} H(Rainy)

H(Weather) = \frac{2}{6} \times 1 + \frac{2}{6} \times 0 + \frac{2}{6} \times 1 \approx 0.667
For Temperature:
The possible values for Temperature are Hot, Mild, and Cool.
We calculate the entropy for each subset of data created by these values.
Subset for Hot (1 "Yes", 1 "No"):
H(Hot) = 1

Subset for Mild:
Outcomes: 2 "Yes", so the subset is pure and H(Mild) = 0.
Subset for Cool (1 "Yes", 1 "No"):
H(Cool) = 1
The weighted entropy after splitting on Temperature is:

H(Temperature) = \frac{2}{6} H(Hot) + \frac{2}{6} H(Mild) + \frac{2}{6} H(Cool)

H(Temperature) = \frac{2}{6} \times 1 + \frac{2}{6} \times 0 + \frac{2}{6} \times 1 \approx 0.667
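As a quick check of these numbers, the short Python snippet below recomputes the dataset entropy, both weighted entropies, and the resulting information gains from the subset class counts listed above; the dictionary layout is just an illustrative way to hold those counts.

```python
import math

def entropy(counts):
    """Entropy of a class distribution given as a sequence of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

# Class counts per subset, read off the calculations above: (Yes, No).
weather = {"Sunny": (1, 1), "Overcast": (2, 0), "Rainy": (1, 1)}
temperature = {"Hot": (1, 1), "Mild": (2, 0), "Cool": (1, 1)}

def weighted_entropy(subsets):
    """Average of subset entropies, weighted by subset size."""
    total = sum(sum(c) for c in subsets.values())
    return sum((sum(c) / total) * entropy(c) for c in subsets.values())

h_dataset = entropy([4, 2])  # ≈ 0.918
for name, subsets in (("Weather", weather), ("Temperature", temperature)):
    h_split = weighted_entropy(subsets)
    print(name, round(h_split, 3), round(h_dataset - h_split, 3))
# Both splits give weighted entropy ≈ 0.667 and information gain ≈ 0.25
```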
Weather and Temperature give the same information gain here, so either could be chosen;
the example uses Weather as the root and creates a branch for each of its values:

Sunny → PlayTennis = No (since both instances under Sunny have this outcome)
Overcast → PlayTennis = Yes (since both instances under Overcast have this outcome)
For Rainy, we split further on Temperature:
Mild → PlayTennis = Yes
Cool → PlayTennis = No
                Weather
              /    |    \
        Sunny   Overcast   Rainy
          |        |       /   \
         No       Yes    Mild  Cool
                           |     |
                          Yes    No
Conclusion:
This decision tree can now be used to predict whether a person will play tennis based on
their weather conditions and temperature. For example:
If the weather is Rainy and the temperature is Mild, the decision is Yes (play tennis), but
if the temperature is Cool, the decision is No (do not play tennis).
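To close the loop, here is a minimal sketch of how such a tree can be applied to new instances. The nested-dict layout mirrors the diagram above (and the representation used in the earlier id3 sketch); the classify helper is an illustrative name, not part of any standard library.

```python
# The tree from the diagram above: inner nodes are {attribute: {value: subtree}},
# leaves are plain class labels.
tree = {
    "Weather": {
        "Sunny": "No",
        "Overcast": "Yes",
        "Rainy": {"Temperature": {"Mild": "Yes", "Cool": "No"}},
    }
}

def classify(node, instance):
    """Follow the branches that match the instance until a leaf label is reached."""
    while isinstance(node, dict):
        attribute = next(iter(node))            # attribute tested at this node
        node = node[attribute][instance[attribute]]
    return node

print(classify(tree, {"Weather": "Rainy", "Temperature": "Mild"}))  # Yes
print(classify(tree, {"Weather": "Rainy", "Temperature": "Cool"}))  # No
```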
This process, using information gain and entropy, helps in creating an efficient and
interpretable decision tree for classification tasks.