10b Understanding Entropy Information Gain

Entropy

• A decision tree is built top-down from a root node and involves partitioning the data into subsets that contain instances with similar values (homogeneous).
• The ID3 algorithm uses entropy to calculate the homogeneity of a sample. If the sample is completely homogeneous, the entropy is zero; if the sample is equally divided, the entropy is one.
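For reference, the entropy of a sample S with class proportions p_i (this formula is not written out on the slide, but it is the standard ID3 definition) is

E(S) = \sum_i -p_i \log_2 p_i

It equals 0 when a single class holds every instance and 1 when a two-class sample is split 50/50, matching the statement above.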
Entropy- example
To build a decision tree, we need to calculate two types of entropy using frequency tables:
• Entropy using the frequency table of one attribute
• Entropy using the frequency table of two attributes
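A minimal Python sketch of both calculations (the counts are assumed for illustration, in the spirit of the classic 14-row play-golf data with 9 Yes / 5 No; they are not taken from the slides):

from math import log2

def entropy(counts):
    """Entropy of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

# Entropy using the frequency table of one attribute (the target):
print(entropy([9, 5]))                      # ~0.940

# Entropy using the frequency table of two attributes
# (target counts per value of a second attribute, e.g. Outlook):
branches = {"Sunny": [2, 3], "Overcast": [4, 0], "Rainy": [3, 2]}
total = sum(sum(c) for c in branches.values())
weighted = sum(sum(c) / total * entropy(c) for c in branches.values())
print(weighted)                             # ~0.693

The weighted value (~0.693) is the two-attribute entropy for the Outlook split; the gap between it and 0.940 is the information gain discussed next.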
Information Gain
• The information gain is based on the decrease in entropy after a dataset is split on an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).

• Step 1: Calculate entropy of the target.
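As a worked illustration with the assumed target counts from the sketch above (9 positive, 5 negative):

E(\text{target}) = -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \approx 0.940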


Information Gain- example
Step 2: The dataset is then split on the different attributes. The entropy of each branch is calculated and added proportionally (weighted by branch size) to give the total entropy of the split. This entropy is subtracted from the entropy before the split; the result is the information gain, or decrease in entropy.
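In symbols (notation mine; the formula itself is the standard ID3 gain), splitting a target set T on an attribute X with values v gives

\text{Gain}(T, X) = E(T) - \sum_{v} \frac{|T_v|}{|T|}\, E(T_v)

where T_v is the subset of T for which X = v, and the sum is exactly the proportional (weighted) entropy described in Step 2.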
Information Gain- example
Step 3: Choose the attribute with the largest information gain as the decision node, divide the dataset by its branches, and repeat the same process on every branch.
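A minimal Python sketch of Step 3 (function and field names are assumed; it reuses the entropy helper from the earlier sketch): compute the gain of each candidate attribute over a list of row dictionaries and pick the largest.

def information_gain(rows, attribute, target):
    """Entropy before the split minus the weighted entropy after it."""
    def class_counts(subset):
        counts = {}
        for r in subset:
            counts[r[target]] = counts.get(r[target], 0) + 1
        return list(counts.values())

    before = entropy(class_counts(rows))
    after = 0.0
    for v in {r[attribute] for r in rows}:
        branch = [r for r in rows if r[attribute] == v]
        after += len(branch) / len(rows) * entropy(class_counts(branch))
    return before - after

def best_attribute(rows, attributes, target):
    # The decision node is the attribute with the largest information gain.
    return max(attributes, key=lambda a: information_gain(rows, a, target))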
Information Gain- example
• Step 4a: A branch with entropy of 0 is a leaf node.
Information Gain- example
Step 4b: A branch with entropy more than 0 needs further
splitting.
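Steps 3-4 can be put together as a small recursive sketch (structure assumed, not taken from the slides): a branch whose entropy is 0, i.e. a single class, becomes a leaf; any other branch is split again on its best remaining attribute.

from collections import Counter

def build_tree(rows, attributes, target):
    counts = Counter(r[target] for r in rows)
    if len(counts) == 1 or not attributes:    # entropy 0, or nothing left to split on
        return counts.most_common(1)[0][0]    # leaf node: the (majority) class
    best = best_attribute(rows, attributes, target)
    tree = {best: {}}
    for v in {r[best] for r in rows}:
        branch = [r for r in rows if r[best] == v]
        remaining = [a for a in attributes if a != best]
        tree[best][v] = build_tree(branch, remaining, target)
    return tree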
Q.1 Define entropy.

Q.2 State examples of entropy in a decision tree (DT).

Q.3 Define information gain.

Q.4 State some examples of information gain.

Q.5 How are entropy and information gain related?
