Tasks on Decision Trees
Q4. What is the entropy of a decision-tree data set with 6 positive and 4 negative examples?
Answer: A (≈ 0.97; see Solution Q4 below)
Assignment tasks - Week 4 2021
Q5. What is the value of the Information Gain in the following partitioning?
Root node: N = 100, Entropy = 0.7
Left child: N = 60, Entropy = 0.4    Right child: N = 40, Entropy = 0.2
A. 0.26
B. 0.38
C. 0.42
D. 0.18
Answer: B
Learning of Decision Trees
We will focus on a particular category of learning techniques called
Top-Down Induction of Decision Trees (TDIDT).
The learning scenario is supervised, non-incremental, data-driven learning from examples.
The system is presented with a set of instances and develops a decision tree from the top down, guided
by frequency information in the examples. The tree is constructed beginning at the root
and proceeding down to its leaves.
The order in which instances are handled should not influence the construction of the tree.
The system typically examines and re-examines all of the instances at many stages during learning.
Building the tree from the top downward, the central issue is to choose and order the features that
discriminate the data items (instances) in an optimal way.
Subtopics:
- Use of information theoretic measures to guide the selection and ordering of features
- Avoiding overfitting by pruning the tree
- Generation of several decision trees in parallel (e.g. random forest).
- Introduction of some kind of inductive bias (e.g. Occam's razor)
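To make the TDIDT scheme above concrete, here is a minimal Python sketch of ID3-style induction. It uses the entropy and Information Gain measures introduced in the following sections; the function names, the representation of instances as feature dictionaries, and the overall structure are illustrative assumptions, not taken from any particular library.

import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Reduction in entropy obtained by splitting on `feature`."""
    n = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def build_tree(rows, labels, features):
    """Top-down induction: choose the best feature, split, and recurse."""
    if len(set(labels)) == 1:            # pure node -> leaf with that class
        return labels[0]
    if not features:                     # no features left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    tree = {best: {}}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree[best][value] = build_tree(
            [rows[i] for i in idx],
            [labels[i] for i in idx],
            [f for f in features if f != best],
        )
    return tree

Each instance is assumed to be a dictionary mapping feature names to categorical values; the result is a nested dictionary whose keys reflect the chosen feature ordering, from the root down to the leaves.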
Purity or Homogeneity
The entire data set (all training instances) is associated with the tree as a whole (the root).
For binary classification: H(S) = −p(+) log2 p(+) − p(−) log2 p(−)
For n-ary classification: H(S) = − Σ_{c ∈ target classes} p(c) log2 p(c)
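A quick numerical check of these formulas, sketched in Python (the helper name entropy_from_counts is an illustrative choice, mirroring the entropy function in the sketch above but working directly from class counts):

import math

def entropy_from_counts(counts):
    """H(S) computed from class counts, e.g. [positives, negatives]."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

print(entropy_from_counts([6, 4]))  # ~0.971: the data set from Q4
print(entropy_from_counts([5, 5]))  # 1.0: a 50/50 split is maximally impure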
Information Gain
Information Gain measures the expected reduction in entropy obtained by splitting a set S on an attribute A:
Gain(S, A) = H(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) · H(S_v)
where S_v is the subset of S for which A takes the value v. Outlook has the highest Information Gain in the example data set and is therefore the preferred feature for discriminating among the data items.
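Since only the subset sizes and entropies are needed, a minimal Python sketch of the computation can work directly from (size, entropy) pairs; the function name and the example numbers below are illustrative assumptions.

def information_gain(parent_entropy, children):
    """children: list of (size, entropy) pairs, one per subset S_v."""
    total = sum(size for size, _ in children)
    remainder = sum((size / total) * h for size, h in children)
    return parent_entropy - remainder

# Illustrative numbers: a node with entropy 1.0 split into two equal halves
print(information_gain(1.0, [(50, 0.8), (50, 0.6)]))  # ~0.3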
Solution Q4
H(S) = −0.6 · log2 0.6 − 0.4 · log2 0.4 = −(0.6 · (−0.737) + 0.4 · (−1.322)) ≈ 0.97
Solution Q5
What is the value of the Information Gain in the following partitioning?
Root node: N = 100, Entropy = 0.7
Left child: N = 60, Entropy = 0.4    Right child: N = 40, Entropy = 0.2
Gain = 0.7 − (60/100 · 0.4 + 40/100 · 0.2) = 0.7 − (0.24 + 0.08) = 0.7 − 0.32 = 0.38 (alternative B)
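A quick sanity check of this arithmetic in Python:

print(0.7 - (60/100 * 0.4 + 40/100 * 0.2))  # ~0.38, i.e. alternative B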