This document discusses decision trees and the ID3 algorithm for constructing decision trees from a set of training data. It explains that decision trees classify data by sorting it down the tree from the root node to a leaf node. The ID3 algorithm uses information gain to select the attribute to test at each node, choosing the attribute that best separates the training examples. It provides an example of using ID3 to build a decision tree to predict whether someone will play tennis based on weather attributes. The attribute with the highest information gain, outlook, is selected as the root node, and the algorithm is applied recursively to build the full tree.
1 ID3 and Decision tree
2 Decision Trees
Decision trees classify instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance. Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values of this attribute.
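To make this structure concrete, here is a minimal sketch (my own illustration, not from the slides) of one common way such a tree can be represented and used for classification; the dictionary layout, the classify helper, and the example attribute names are assumptions:

# Minimal sketch: an internal node is a dict holding the attribute to test and
# one branch per attribute value; a leaf is simply a class label.
def classify(tree, instance):
    # Sort the instance down the tree from the root until a leaf label is reached.
    while isinstance(tree, dict):
        value = instance[tree["attribute"]]     # test the attribute at this node
        tree = tree["branches"][value]          # descend along the matching branch
    return tree

# Hypothetical one-level tree, just to show the shape.
example_tree = {"attribute": "Weather",
                "branches": {"Sunny": "Play", "Rainy": "Stay in"}}
print(classify(example_tree, {"Weather": "Sunny"}))   # prints "Play"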
3 Why Decision Trees?
They generalize well to unobserved instances. They are computationally efficient: the cost grows in proportion to the number of training instances observed. The tree is easy to interpret: attributes are arranged according to how much information they provide, which makes the classification process self-evident. Algorithms in this area: ID3, C4.5, etc.

4 ID3 algorithm
ID3 is an algorithm for constructing a decision tree. It builds the tree through a top-down, greedy search of the given training data, testing an attribute at every node. It uses a statistical property called information gain to select which attribute to test at each node in the tree. Information gain measures how well a given attribute separates the training examples according to their target classification. Entropy is used to compute the information gain, and the attribute with the best (highest) gain is then selected.

5 Entropy
Entropy is a measure from information theory that characterizes the impurity of an arbitrary collection of examples. For a sample with positive and negative examples, the formula for entropy is:
E(S) = -(p+) * log2(p+) - (p-) * log2(p-)
where p+ is the proportion of positive examples in S, p- is the proportion of negative examples in S, and S is the sample of training examples.

6 Example
An attribute A1 splits the sample S = [29+, 35-] into a True branch with [21+, 5-] and a False branch with [8+, 30-].
Entropy of the whole sample: E(S) = -29/64 * log2(29/64) - 35/64 * log2(35/64) = 0.9937
Entropy of the True branch: E(TRUE) = -21/26 * log2(21/26) - 5/26 * log2(5/26) = 0.7063
Entropy of the False branch: E(FALSE) = -8/38 * log2(8/38) - 30/38 * log2(30/38) = 0.7425
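As a quick sanity check of these numbers, the entropy formula from slide 5 can be coded up directly. This is a small illustrative sketch; the function name is my own, not from the slides:

import math

def entropy(pos, neg):
    # Entropy of a two-class sample with `pos` positive and `neg` negative examples.
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count > 0:                  # treat 0 * log2(0) as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(round(entropy(29, 35), 4))   # whole sample [29+, 35-] -> 0.9937
print(round(entropy(21, 5), 4))    # True branch  [21+, 5-]  -> 0.7063
print(round(entropy(8, 30), 4))    # False branch [8+, 30-]  -> 0.7425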
7 Information Gain
Gain(Sample, Attribute), written Gain(S, A), is the expected reduction in entropy due to sorting S on attribute A:
Gain(S, A) = Entropy(S) - SUM over v in values(A) of (|S_v| / |S|) * Entropy(S_v)
For the previous example, the information gain of A1 is:
Gain(S, A1) = E(S) - 26/64 * E(TRUE) - 38/64 * E(FALSE) = 0.9937 - 26/64 * 0.7063 - 38/64 * 0.7425 ≈ 0.266

8 The complete example
Day  Outlook   Temp  Humidity  Wind    Play Tennis
D1   Sunny     Hot   High      Weak    No
D2   Sunny     Hot   High      Strong  No
D3   Overcast  Hot   High      Weak    Yes
D4   Rain      Mild  High      Weak    Yes
D5   Rain      Cool  Normal    Weak    Yes
D6   Rain      Cool  Normal    Strong  No
D7   Overcast  Cool  Normal    Strong  Yes
D8   Sunny     Mild  High      Weak    No
D9   Sunny     Cool  Normal    Weak    Yes
D10  Rain      Mild  Normal    Weak    Yes
D11  Sunny     Mild  Normal    Strong  Yes
D12  Overcast  Mild  High      Strong  Yes
D13  Overcast  Hot   Normal    Weak    Yes
D14  Rain      Mild  High      Strong  No

9 Decision tree
We want to build a decision tree for scheduling tennis matches. Whether a match is played depends on the weather (Outlook, Temperature, Humidity, and Wind), so we apply what we know to build a decision tree from the table above.

10 Example
The whole sample has 9 positive (Yes) and 5 negative (No) examples, so:
E(S) = -(9/14) * log2(9/14) - (5/14) * log2(5/14) = 0.940
Next, calculate the information gain for each of the weather attributes: Temp, Wind, Humidity, and Outlook.
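The per-attribute gains worked out on the next four slides can also be computed programmatically. The sketch below is my own illustration (the helper names are assumptions); it applies the Gain(S, A) formula from slide 7 to the table on slide 8:

import math
from collections import Counter

# Training data from slide 8: (Outlook, Temp, Humidity, Wind, PlayTennis).
data = [
    ("Sunny",    "Hot",  "High",   "Weak",   "No"),
    ("Sunny",    "Hot",  "High",   "Strong", "No"),
    ("Overcast", "Hot",  "High",   "Weak",   "Yes"),
    ("Rain",     "Mild", "High",   "Weak",   "Yes"),
    ("Rain",     "Cool", "Normal", "Weak",   "Yes"),
    ("Rain",     "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny",    "Mild", "High",   "Weak",   "No"),
    ("Sunny",    "Cool", "Normal", "Weak",   "Yes"),
    ("Rain",     "Mild", "Normal", "Weak",   "Yes"),
    ("Sunny",    "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High",   "Strong", "Yes"),
    ("Overcast", "Hot",  "Normal", "Weak",   "Yes"),
    ("Rain",     "Mild", "High",   "Strong", "No"),
]
attributes = ["Outlook", "Temp", "Humidity", "Wind"]

def entropy(labels):
    # Entropy of a list of class labels ("Yes"/"No").
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain(rows, attr_index):
    # Gain(S, A) = Entropy(S) - sum over values v of A of (|S_v| / |S|) * Entropy(S_v)
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in set(row[attr_index] for row in rows):
        subset_labels = [row[-1] for row in rows if row[attr_index] == value]
        remainder += (len(subset_labels) / len(rows)) * entropy(subset_labels)
    return entropy(labels) - remainder

for index, name in enumerate(attributes):
    print(name, round(gain(data, index), 3))
# Prints: Outlook 0.247, Temp 0.029, Humidity 0.152, Wind 0.048
# (slide 13 shows the Humidity gain as 0.151 because it rounds the intermediate entropies)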
11 For the Temp
Temp splits S = [9+, 5-] (E = 0.940) into: Hot [2+, 2-] with E = 1.0, Mild [4+, 2-] with E = 0.918, and Cool [3+, 1-] with E = 0.811.
Gain(S, Temp) = 0.940 - (4/14)*1.0 - (6/14)*0.918 - (4/14)*0.811 = 0.029

12 For the Wind
Wind splits S = [9+, 5-] (E = 0.940) into: Weak [6+, 2-] with E = 0.811 and Strong [3+, 3-] with E = 1.0.
Gain(S, Wind) = 0.940 - (8/14)*0.811 - (6/14)*1.0 = 0.048

13 For the Humidity
Humidity splits S = [9+, 5-] (E = 0.940) into: High [3+, 4-] with E = 0.985 and Normal [6+, 1-] with E = 0.592.
Gain(S, Humidity) = 0.940 - (7/14)*0.985 - (7/14)*0.592 = 0.151

14 For the Outlook
Outlook splits S = [9+, 5-] (E = 0.940) into: Sunny [2+, 3-] with E = 0.971, Overcast [4+, 0-] with E = 0.0, and Rain [3+, 2-] with E = 0.971.
Gain(S, Outlook) = 0.940 - (5/14)*0.971 - (4/14)*0.0 - (5/14)*0.971 = 0.247

15 Choosing Attributes
Select the attribute with the maximum information gain, which is Outlook, and split on it at the root. Then apply the algorithm recursively to each child node of the root, until leaf nodes (nodes with entropy = 0) are reached.

16 Complete tree
Outlook = Overcast: Yes [D3, D7, D12, D13]
Outlook = Sunny: test Humidity; High: No [D1, D2]; Normal: Yes [D8, D9, D11]
Outlook = Rain: test Wind; Strong: No [D6, D14]; Weak: Yes [D4, D5, D10]

17 Reference
Dr. Lee's slides, San Jose State University, Spring 2007
"Building Decision Trees with the ID3 Algorithm", Andrew Colin, Dr. Dobb's Journal, June 1996
"Incremental Induction of Decision Trees", Paul E. Utgoff, Kluwer Academic Publishers, 1989
http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm
http://decisiontrees.net/node/27
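As a closing illustration (not part of the original slides), the complete tree from slide 16 can be written out as plain nested conditionals; the function name is my own:

def play_tennis(outlook, humidity, wind):
    # Classify a day using the complete tree on slide 16.
    # Temperature does not appear anywhere in the final tree.
    if outlook == "Overcast":
        return "Yes"                                      # D3, D7, D12, D13
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"    # D8, D9, D11 vs. D1, D2
    if outlook == "Rain":
        return "Yes" if wind == "Weak" else "No"          # D4, D5, D10 vs. D6, D14
    raise ValueError("unexpected Outlook value: " + outlook)

print(play_tennis("Sunny", "High", "Weak"))    # D1  -> prints "No"
print(play_tennis("Rain", "High", "Strong"))   # D14 -> prints "No"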