
Bule Hora University

College of Informatics
Department of Information Technology
Machine Learning (IT-5142)
Paper Presentation On
“An Improved ID3 Decision Tree Algorithm”
Authors: Xiaojuan Chen, Zhigang Zhang, Yue Tong
Year of Publication: © 2019 The Authors. Published by Elsevier B.V.
Presented by: Demisew Debisa
Submitted to: Dr. Beakal Gizachew Assefa (PhD)
April 25, 2021, Bule Hora, Ethiopia

This Presentation Includes…
1. Introduction
2. ID3 Algorithm
3. Improved ID3 Algorithm
4. Conclusion
5. Future Works
1. Introduction

 Decision tree is a classifier in the form of a tree structure.
 It classifies instances or examples by starting at the root of the tree and moving through it until a leaf node is reached.
 This presentation covers an improved algorithm based on expected information entropy and an Association Function instead of the traditional information gain.
Illustration/Indicative

(1) Where to start? (the root)
(2) Which node to proceed to next?
(3) When to stop / come to a conclusion?
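As an indicative sketch of these three questions (not from the paper; the tree layout, attribute names, and labels below are hypothetical), classification walks from the root to a leaf:

    # Hypothetical tree: a dict node holds the attribute to test and its branches;
    # a plain string is a leaf (the conclusion).
    tree = {
        "attribute": "age",                      # (1) start at the root
        "branches": {
            "youth": {"attribute": "student",    # (2) proceed along the matching branch
                      "branches": {"yes": "buys", "no": "does not buy"}},
            "middle": "buys",                    # (3) stop once a leaf is reached
            "senior": "buys",
        },
    }

    def classify(node, instance):
        # Follow the branch matching the instance's attribute value until a leaf.
        while isinstance(node, dict):
            node = node["branches"][instance[node["attribute"]]]
        return node

    print(classify(tree, {"age": "youth", "student": "yes"}))  # -> buys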


2. ID3 Algorithm

 This algorithm selects the splitting attribute by calculating information gain.
 It is a classification algorithm based on Information Entropy.
What is Entropy?
 A measure of the uncertainty associated with a random variable.
 For a binary classification, its value ranges from 0 to 1.
ID3 Algorithm…

 Given a set S of positive and negative examples of some target concept (a 2-class problem), the entropy of set S relative to this binary classification is

E(S) = -p(P)·log2 p(P) - p(N)·log2 p(N)

Example:
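The slide's worked example is a figure that is not reproduced here; as a minimal sketch of the formula above (the first two inputs are illustrative assumptions, while 9/14 and 5/14 reappear later in these slides):

    from math import log2

    def entropy(p_pos, p_neg):
        # E(S) = -p(P)*log2 p(P) - p(N)*log2 p(N); 0*log2(0) is taken as 0.
        return sum(-p * log2(p) for p in (p_pos, p_neg) if p > 0)

    print(entropy(0.5, 0.5))      # 1.0   -> maximum uncertainty
    print(entropy(1.0, 0.0))      # 0.0   -> a pure set, no uncertainty
    print(entropy(9/14, 5/14))    # ~0.940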
ID3 Algorithm…

Information Gain
 Used as an attribute selection measure.
 Pick the attribute that has the highest information gain.
For a given data partition D and a candidate attribute A:

Gain(A, D) = Entropy(D) - Σj (|Dj|/|D|)·Entropy(Dj)

where the Dj are the subsets of D induced by the values of attribute A.
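A minimal sketch of this formula, assuming the partition D is a list of dict rows keyed by attribute names (the helper names here are mine, not the paper's):

    from math import log2
    from collections import Counter

    def entropy(labels):
        # Entropy(D) over a list of class labels.
        n = len(labels)
        return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

    def info_gain(rows, attr, target):
        # Gain(attr, D) = Entropy(D) - sum_j (|Dj|/|D|) * Entropy(Dj)
        n = len(rows)
        expected = 0.0
        for value in {r[attr] for r in rows}:            # each partition Dj
            part = [r[target] for r in rows if r[attr] == value]
            expected += len(part) / n * entropy(part)
        return entropy([r[target] for r in rows]) - expected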
Example:

 A customer database of a shopping mall is shown in Table 1 (a training sample set). The decision attribute, named “Buy-computer”, can take two different values: buying-computer or not buying-computer.
Example …

 Class P: buys_computer = “yes”
 Class N: buys_computer = “no”
According to ID3 formula (1), compute Entropy(S):

Entropy(S) = -(9/14)·log2(9/14) - (5/14)·log2(5/14) = 0.940

Then the information gain of each condition attribute is computed. Starting with the attribute age:

Gain(age, S) = Entropy(S) - Σj (|Sj|/|S|)·Entropy(Sj) = 0.940 - 0.694 = 0.246
Example …

 Gain(age, D) = 0.2467
 Gain(income, D) = 0.0292
 Gain(student, D) = 0.1518
 Gain(credit_rating, D) = 0.0161
 Thus, the condition attribute age has the largest information gain, so the tree takes age as the root node. Computing recursively, the ID3 decision tree can finally be generated as shown in Fig. 1 (see the sketch below).
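As a sketch of this recursive construction (reusing the entropy and info_gain helpers from the earlier sketch; not the paper's code):

    from collections import Counter

    def build_id3(rows, attrs, target):
        # Recursive ID3: split on the highest-gain attribute, recurse per value.
        labels = [r[target] for r in rows]
        if len(set(labels)) == 1:        # pure partition -> leaf
            return labels[0]
        if not attrs:                    # no attributes left -> majority-class leaf
            return Counter(labels).most_common(1)[0][0]
        best = max(attrs, key=lambda a: info_gain(rows, a, target))
        return {"attribute": best, "branches": {
            v: build_id3([r for r in rows if r[best] == v],
                         [a for a in attrs if a != best], target)
            for v in {r[best] for r in rows}}}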
3. Improved ID3 Algorithm (AFI-ID3)

 We now apply the improved ID3 algorithm to the same database. By formula (4), the relevance between each condition attribute and the decision attribute can be obtained.
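Formula (4) itself is not reproduced on these slides, so the following is only a schematic of the selection rule described here: adjust each attribute's expected entropy by an association-function relevance weight and split on the smallest adjusted value E(A)′. The association stub and the division used for the adjustment are hypothetical placeholders, not the paper's actual formula (4):

    def association(rows, attr, target):
        # Placeholder for the paper's formula (4): a relevance score between
        # a condition attribute and the decision attribute (assumed in (0, 1]).
        raise NotImplementedError("substitute the paper's Association Function here")

    def pick_attribute_afi(rows, attrs, target):
        # Schematic AFI-ID3 selection: smallest adjusted expected entropy E(A)'.
        def adjusted(a):
            expected = entropy([r[target] for r in rows]) - info_gain(rows, a, target)
            return expected / association(rows, a, target)
        return min(attrs, key=adjusted)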
Improved ID3 Algorithm (AFI-ID3) …

 So E(Credit)′ = 0.3692 is the smallest, even though attribute credit does not have the largest value under the traditional information gain. Likewise, computing recursively, the improved ID3 decision tree is generated as follows.
4. Conclusion

 The improved ID3 is based on an Association Function, by which the weights of the condition attributes are adjusted objectively by the computer.
 Subjective evaluation of attribute importance is avoided, and ID3's bias toward attributes with many values is reduced by using the Association Function (AF).
5. Future Works

 Future research can examine how these improved algorithms have been applied in real-world scenarios and their adoption by researchers.
 Currently, the improved algorithms appear isolated, and their usefulness outside the research community cannot be ascertained.
 Furthermore, little research has been done on the use of evolutionary algorithms for optimal feature selection; further work is needed in this area, as proper feature selection in large datasets can significantly improve the performance of the algorithms.
