Iterative Dichotomiser 3 (ID3)
Algorithm
Medha Pradhan
CS 157B, Spring 2007
Agenda
Basics of Decision Tree
Introduction to ID3
Entropy and Information Gain
Two Examples
Basics
What is a decision tree?
A tree where each branching (decision) node represents a choice between two or more alternatives, with every branching node being part of a path to a leaf node.
Decision node: specifies a test of some attribute.
Leaf node: indicates the classification of an example.
ID3
Invented by J. Ross Quinlan
Employs a top-down, greedy search through the space of possible decision trees.
Greedy because there is no backtracking: at each step it commits to the locally best choice.
At each node it selects the attribute that is most useful for classifying the examples, i.e., the attribute with the highest Information Gain.
Entropy
Entropy measures the impurity of an arbitrary collection of examples.
For a collection S, entropy is given as:
Entropy(S) = Σi -pi log2(pi)
where pi is the proportion of S belonging to class i.
For a collection S having only positive and negative examples, this reduces to:
Entropy(S) = -p+ log2(p+) - p- log2(p-)
where p+ is the proportion of positive examples and p- is the proportion of negative examples.
In general, Entropy(S) = 0 if all members of S belong to the same class.
Entropy(S) = 1 (the maximum for two classes) when the members are split equally between the classes.
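As a quick sanity check, here is a minimal Python sketch of this computation (the function name and counts-based interface are illustrative choices, not from the slides):

import math

def entropy(positives, negatives):
    # Entropy of a two-class collection, given its class counts.
    total = positives + negatives
    result = 0.0
    for count in (positives, negatives):
        if count > 0:  # treat 0 * log2(0) as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy(4, 2))  # ~0.91830, the value used in Example 1 below
print(entropy(3, 3))  # 1.0: an even split gives maximum entropy
print(entropy(6, 0))  # 0.0: a pure collection has zero entropy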
Information Gain
Measures the expected reduction in entropy from partitioning the examples on an attribute: the higher the Information Gain, the greater the expected reduction in entropy.
Gain(S,A) = Entropy(S) - Σ(v in Values(A)) (|Sv|/|S|) * Entropy(Sv)
where Values(A) is the set of all possible values for attribute A, and Sv is the subset of S for which attribute A has value v.
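The same definition in code might look like the sketch below; representing the dataset as (attribute-dict, label) pairs is an assumption made for illustration:

import math
from collections import Counter

def entropy_of(labels):
    # Entropy of a list of class labels (works for any number of classes).
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def gain(examples, attribute):
    # Information gain of splitting `examples` on `attribute`,
    # where each example is an (attribute_dict, label) pair.
    labels = [label for _, label in examples]
    remainder = 0.0
    for v in {attrs[attribute] for attrs, _ in examples}:
        subset = [label for attrs, label in examples
                  if attrs[attribute] == v]
        remainder += (len(subset) / len(labels)) * entropy_of(subset)
    return entropy_of(labels) - remainder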
Example 1
Sample training data to determine whether an animal lays eggs.
The first four attributes (Warm-blooded, Feathers, Fur, Swims) are the independent/condition attributes; 'Lays Eggs' is the dependent/decision attribute.

Animal     Warm-blooded  Feathers  Fur  Swims  Lays Eggs
Ostrich    Yes           Yes       No   No     Yes
Crocodile  No            No        No   Yes    Yes
Raven      Yes           Yes       No   No     Yes
Albatross  Yes           Yes       No   No     Yes
Dolphin    Yes           No        No   Yes    No
Koala      Yes           No        Yes  No     No
Entropy(4Y,2N): -(4/6)log2(4/6) – (2/6)log2(2/6)
= 0.91829
Now, we have to find the IG for all four attributes: Warm-blooded, Feathers, Fur, Swims.
For attribute ‘Warm-blooded’:
Values(Warm-blooded) : [Yes,No]
S = [4Y,2N]
SYes = [3Y,2N] E(SYes) = 0.97095
SNo = [1Y,0N] E(SNo) = 0 (all members belong to same class)
Gain(S,Warm-blooded) = 0.91829 – [(5/6)*0.97095 + (1/6)*0]
= 0.10916
For attribute ‘Feathers’:
Values(Feathers) : [Yes,No]
S = [4Y,2N]
SYes = [3Y,0N] E(SYes) = 0
SNo = [1Y,2N] E(SNo) = 0.91829
Gain(S,Feathers) = 0.91829 – [(3/6)*0 + (3/6)*0.91829]
= 0.45914
For attribute ‘Fur’:
Values(Fur) : [Yes,No]
S = [4Y,2N]
SYes = [0Y,1N] E(SYes) = 0
SNo = [4Y,1N] E(SNo) = 0.7219
Gain(S,Fur) = 0.91829 – [(1/6)*0 + (5/6)*0.7219]
= 0.3167
For attribute ‘Swims’:
Values(Swims) : [Yes,No]
S = [4Y,2N]
SYes = [1Y,1N] E(SYes) = 1 (equal members in both classes)
SNo = [3Y,1N] E(SNo) = 0.81127
Gain(S,Swims) = 0.91829 – [(2/6)*1 + (4/6)*0.81127]
= 0.04411
Gain(S,Warm-blooded) = 0.10916
Gain(S,Feathers) = 0.45914
Gain(S,Fur) = 0.31670
Gain(S,Swims) = 0.04411
Gain(S,Feathers) is maximum, so Feathers is chosen as the root node.
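These four numbers are easy to verify with the gain() sketch from the Information Gain slide; the dataset encoding below is our own:

animals = [
    ({"Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No"},  "Yes"),  # Ostrich
    ({"Warm-blooded": "No",  "Feathers": "No",  "Fur": "No",  "Swims": "Yes"}, "Yes"),  # Crocodile
    ({"Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No"},  "Yes"),  # Raven
    ({"Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No"},  "Yes"),  # Albatross
    ({"Warm-blooded": "Yes", "Feathers": "No",  "Fur": "No",  "Swims": "Yes"}, "No"),   # Dolphin
    ({"Warm-blooded": "Yes", "Feathers": "No",  "Fur": "Yes", "Swims": "No"},  "No"),   # Koala
]
for a in ["Warm-blooded", "Feathers", "Fur", "Swims"]:
    print(a, round(gain(animals, a), 5))
# ~0.10917, 0.45915, 0.31669, 0.04411 (matching the slides up to rounding)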
Splitting on Feathers:

Feathers
  Y: [Ostrich, Raven, Albatross] -> Lays Eggs
  N: [Crocodile, Dolphin, Koala] -> ?

The 'Y' descendant has only positive examples and becomes the leaf node with classification 'Lays Eggs'. The 'N' branch must be split further.
Animal     Warm-blooded  Feathers  Fur  Swims  Lays Eggs
Crocodile  No            No        No   Yes    Yes
Dolphin    Yes           No        No   Yes    No
Koala      Yes           No        Yes  No     No
We now repeat the procedure on this subset:
S: [Crocodile, Dolphin, Koala]
S: [1+,2-]
Entropy(S) = -(1/3)log2(1/3) – (2/3)log2(2/3)
= 0.91829
For attribute ‘Warm-blooded’:
Values(Warm-blooded) : [Yes,No]
S = [1Y,2N]
SYes = [0Y,2N] E(SYes) = 0
SNo = [1Y,0N] E(SNo) = 0
Gain(S,Warm-blooded) = 0.91829 – [(2/3)*0 + (1/3)*0] = 0.91829
For attribute ‘Fur’:
Values(Fur) : [Yes,No]
S = [1Y,2N]
SYes = [0Y,1N] E(SYes) = 0
SNo = [1Y,1N] E(SNo) = 1
Gain(S,Fur) = 0.91829 – [(1/3)*0 + (2/3)*1] = 0.25162
For attribute ‘Swims’:
Values(Swims) : [Yes,No]
S = [1Y,2N]
SYes = [1Y,1N] E(SYes) = 1
SNo = [0Y,1N] E(SNo) = 0
Gain(S,Swims) = 0.91829 – [(2/3)*1 + (1/3)*0] = 0.25162
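These subset gains can be checked by filtering the animals list from the earlier sketch down to the 'N' branch:

no_feathers = [(attrs, label) for attrs, label in animals
               if attrs["Feathers"] == "No"]
for a in ["Warm-blooded", "Fur", "Swims"]:
    print(a, round(gain(no_feathers, a), 5))
# ~0.91830 for Warm-blooded, ~0.25163 for Fur and for Swims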
Gain(S,Warm-blooded) is maximum, so Warm-blooded becomes the next decision node. The final decision tree will be:

Feathers
  Y: Lays Eggs
  N: Warm-blooded
       Y: Does not lay eggs
       N: Lays Eggs
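Putting everything together, here is a minimal recursive ID3 sketch (it reuses entropy_of and gain from earlier; tie-breaking, missing values, and pruning are deliberately ignored):

from collections import Counter

def id3(examples, attributes):
    # Returns either a class label (leaf) or a pair
    # (attribute, {value: subtree}) for a decision node.
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:   # all examples in one class -> leaf
        return labels[0]
    if not attributes:          # nothing left to split on -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: gain(examples, a))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for v in {attrs[best] for attrs, _ in examples}:
        subset = [(attrs, label) for attrs, label in examples
                  if attrs[best] == v]
        branches[v] = id3(subset, remaining)
    return (best, branches)

print(id3(animals, ["Warm-blooded", "Feathers", "Fur", "Swims"]))
# e.g. ('Feathers', {'Yes': 'Yes',
#                    'No': ('Warm-blooded', {'Yes': 'No', 'No': 'Yes'})})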
Example 2
Factors affecting sunburn
Name   Hair    Height   Weight   Lotion  Sunburned
Sarah  Blonde  Average  Light    No      Yes
Dana   Blonde  Tall     Average  Yes     No
Alex   Brown   Short    Average  Yes     No
Annie  Blonde  Short    Average  No      Yes
Emily  Red     Average  Heavy    No      Yes
Pete   Brown   Tall     Heavy    No      No
John   Brown   Average  Heavy    No      No
Katie  Blonde  Short    Light    Yes     No
S = [3+, 5-]
Entropy(S) = -(3/8)log2(3/8) – (5/8)log2(5/8)
= 0.95443
Find IG for all 4 attributes: Hair, Height, Weight, Lotion
For attribute ‘Hair’:
Values(Hair) : [Blonde, Brown, Red]
S = [3+,5-]
SBlonde = [2+,2-] E(SBlonde) = 1
SBrown = [0+,3-] E(SBrown) = 0
SRed = [1+,0-] E(SRed) = 0
Gain(S,Hair) = 0.95443 – [(4/8)*1 + (3/8)*0 + (1/8)*0]
= 0.45443
For attribute ‘Height’:
Values(Height) : [Average, Tall, Short]
SAverage = [2+,1-] E(SAverage) = 0.91829
STall = [0+,2-] E(STall) = 0
SShort = [1+,2-] E(SShort) = 0.91829
Gain(S,Height) = 0.95443 – [(3/8)*0.91829 + (2/8)*0 + (3/8)*0.91829]
= 0.26571
For attribute ‘Weight’:
Values(Weight) : [Light, Average, Heavy]
SLight = [1+,1-] E(SLight) = 1
SAverage = [1+,2-] E(SAverage) = 0.91829
SHeavy = [1+,2-] E(SHeavy) = 0.91829
Gain(S,Weight) = 0.95443 – [(2/8)*1 + (3/8)*0.91829 + (3/8)*0.91829]
= 0.01571
For attribute ‘Lotion’:
Values(Lotion) : [Yes, No]
SYes = [0+,3-] E(SYes) = 0
SNo = [3+,2-] E(SNo) = 0.97095
Gain(S,Lotion) = 0.95443 – [(3/8)*0 + (5/8)*0.97095]
= 0.34759
Gain(S,Hair) = 0.45443
Gain(S,Height) = 0.26571
Gain(S,Weight) = 0.01571
Gain(S,Lotion) = 0.34759
Gain(S,Hair) is maximum, so Hair is chosen as the root node.
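The gain() sketch handles the three-valued Hair attribute without any change; encoding the sunburn table the same way (the encoding is ours) reproduces these numbers:

sunburn = [
    ({"Hair": "Blonde", "Height": "Average", "Weight": "Light",   "Lotion": "No"},  "Yes"),  # Sarah
    ({"Hair": "Blonde", "Height": "Tall",    "Weight": "Average", "Lotion": "Yes"}, "No"),   # Dana
    ({"Hair": "Brown",  "Height": "Short",   "Weight": "Average", "Lotion": "Yes"}, "No"),   # Alex
    ({"Hair": "Blonde", "Height": "Short",   "Weight": "Average", "Lotion": "No"},  "Yes"),  # Annie
    ({"Hair": "Red",    "Height": "Average", "Weight": "Heavy",   "Lotion": "No"},  "Yes"),  # Emily
    ({"Hair": "Brown",  "Height": "Tall",    "Weight": "Heavy",   "Lotion": "No"},  "No"),   # Pete
    ({"Hair": "Brown",  "Height": "Average", "Weight": "Heavy",   "Lotion": "No"},  "No"),   # John
    ({"Hair": "Blonde", "Height": "Short",   "Weight": "Light",   "Lotion": "Yes"}, "No"),   # Katie
]
for a in ["Hair", "Height", "Weight", "Lotion"]:
    print(a, round(gain(sunburn, a), 5))
# ~0.45443, 0.26571, 0.01571, 0.34759 -- Hair wins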
Splitting on Hair:

Hair
  Blonde: [Sarah, Dana, Annie, Katie] -> ?
  Red:    [Emily] -> Sunburned
  Brown:  [Alex, Pete, John] -> Not Sunburned

The 'Red' and 'Brown' descendants are pure and become leaf nodes; the 'Blonde' branch must be split further.
Name   Hair    Height   Weight   Lotion  Sunburned
Sarah  Blonde  Average  Light    No      Yes
Dana   Blonde  Tall     Average  Yes     No
Annie  Blonde  Short    Average  No      Yes
Katie  Blonde  Short    Light    Yes     No
Repeating the procedure on the 'Blonde' subset:
S = [Sarah, Dana, Annie, Katie]
S: [2+,2-]
Entropy(S) = 1
Find IG for the remaining 3 attributes: Height, Weight, Lotion
For attribute ‘Height’:
Values(Height) : [Average, Tall, Short]
S = [2+,2-]
SAverage = [1+,0-] E(SAverage) = 0
STall = [0+,1-] E(STall) = 0
SShort = [1+,1-] E(SShort) = 1
Gain(S,Height) = 1 – [(1/4)*0 + (1/4)*0 + (2/4)*1]
= 0.5
For attribute ‘Weight’:
Values(Weight) : [Average, Light]
S = [2+,2-]
SAverage = [1+,1-] E(SAverage) = 1
SLight = [1+,1-] E(SLight) = 1
Gain(S,Weight) = 1 – [(2/4)*1 + (2/4)*1]
=0
For attribute ‘Lotion’:
Values(Lotion) : [Yes, No]
S = [2+,2-]
SYes = [0+,2-] E(SYes) = 0
SNo = [2+,0-] E(SNo) = 0
Gain(S,Lotion) = 1 – [(2/4)*0 + (2/4)*0]
=1
Gain(S,Lotion) is maximum, so Lotion becomes the decision node for the 'Blonde' branch. In this case, the final decision tree will be:

Hair
  Blonde: Lotion
      Y: Not Sunburned
      N: Sunburned
  Red: Sunburned
  Brown: Not Sunburned
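Running the id3 sketch from Example 1 on this dataset yields the same tree:

print(id3(sunburn, ["Hair", "Height", "Weight", "Lotion"]))
# e.g. ('Hair', {'Blonde': ('Lotion', {'Yes': 'No', 'No': 'Yes'}),
#                'Red': 'Yes', 'Brown': 'No'})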
References
"Machine Learning", by Tom Mitchell, McGraw-Hill, 1997
"Building Decision Trees with the ID3 Algorithm", by:
Andrew Colin, Dr. Dobbs Journal, June 1996
http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/dt_pro
b1.html
Professor Sin-Min Lee, SJSU.
http://cs.sjsu.edu/~lee/cs157b/cs157b.html