18. Decision Tree
Information Gain
Selecting the Next Attribute
S = [9+,5-], E(S) = 0.940
[Figure: candidate splits of S on Humidity and on Wind; branch labels Sunny, Overcast, Rain]
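The value E(S) = 0.940 can be checked by hand. The sketch below assumes the usual two-class entropy in bits, E(S) = -p+ log2(p+) - p- log2(p-), which this extract does not spell out; the function name `entropy` is illustrative only.

```python
import math

def entropy(pos: int, neg: int) -> float:
    """Entropy in bits of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # by convention, 0 * log2(0) is treated as 0
            p = count / total
            result -= p * math.log2(p)
    return result

# S = [9+,5-] from the slide
e_s = entropy(9, 5)
print(f"{e_s:.3f}")  # 0.940
```

A pure sample (all positive or all negative) gives entropy 0, and an evenly split sample gives entropy 1.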
Selecting the Next Attribute
The information gain values for the four attributes are:
• Gain(S, Outlook) = 0.247
• Gain(S, Humidity) = 0.151
• Gain(S, Wind) = 0.048
• Gain(S, Temperature) = 0.029
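Choosing the next attribute then amounts to taking the one with the largest gain. A minimal sketch using only the four values listed above; the names `gains` and `best_attribute` are illustrative.

```python
# Information gain values as listed on the slide
gains = {
    "Outlook": 0.247,
    "Humidity": 0.151,
    "Wind": 0.048,
    "Temperature": 0.029,
}

# Pick the attribute whose gain is highest
best_attribute = max(gains, key=gains.get)
print(best_attribute)  # Outlook
```

This matches the tree on the ID3 slide, where Outlook is the root.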
ID3 Algorithm
[Figure: partially built tree with root Outlook over examples [D1,D2,…,D14], [9+,5-]; leaf labels No, Yes, No, Yes]
Arguments in favor of preferring short hypotheses:
• There are fewer short hypotheses than long hypotheses
• A short hypothesis that fits the data is unlikely to be a coincidence
• A long hypothesis that fits the data might be a coincidence
Arguments opposed:
• There are many ways to define small sets of hypotheses