Decision Trees and Random Forests Explained

What is a Decision Tree?

A decision tree is a predictive model that uses a flowchart-like structure to make
decisions based on input data. It divides data into branches and assigns
outcomes to leaf nodes. Decision trees are used for classification and regression
tasks, and they produce models that are easy to understand.

A decision tree is also a hierarchical model used in decision support that depicts
decisions and their potential outcomes, incorporating chance events, resource
expenses, and utility. The structure consists of a root node, branches,
internal nodes, and leaf nodes.

It is a tool with applications in several different areas. As the name
suggests, it uses a tree-shaped flowchart to show the
predictions that result from a series of feature-based splits. It starts with a root
node and ends with a decision made at the leaves.

ICT 4102 Artificial Intelligence Lab – Decision Tree & Random Forest
Decision Tree Terminologies
Before learning more about decision trees, let’s get familiar with some of the
terminology:

• Root Node – the node at the beginning of a decision tree. From this node
the population starts dividing according to various features.
• Decision Nodes – the nodes we get after splitting the root node, which are
themselves split further.
• Leaf Nodes – the nodes where further splitting is not possible, also called
terminal nodes.
• Sub-tree – just as a small portion of a graph is called a sub-graph,
a sub-section of a decision tree is called a sub-tree.
• Pruning – cutting down some nodes to prevent overfitting.

Example of Decision Tree
Let’s understand decision trees with the help of an example:

Day  Weather  Temperature  Humidity  Wind    Play?
1    Sunny    Hot          High      Weak    No
2    Cloudy   Hot          High      Weak    Yes
3    Sunny    Mild         Normal    Strong  Yes
4    Cloudy   Mild         High      Strong  Yes
5    Rainy    Mild         High      Strong  No
6    Rainy    Cool         Normal    Strong  No
7    Rainy    Mild         High      Weak    Yes
8    Sunny    Hot          High      Strong  No
9    Cloudy   Hot          Normal    Weak    Yes
10   Rainy    Mild         High      Strong  No

Decision trees are drawn upside down: the root is at the top, and it is
split into several nodes. In layman's terms, a decision tree is nothing but a bunch
of if-else statements. It checks whether a condition is true, and if it
is, it moves to the next node attached to that decision.

In the diagram for this example, the tree first asks about the weather: is it sunny,
cloudy, or rainy? Depending on the answer, it moves to the next feature, which is
humidity or wind. For a rainy day it then checks whether the wind is strong or weak;
if the wind is weak and it’s rainy, the person may go and play.
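The "bunch of if-else statements" view can be sketched directly in Python. This is a minimal sketch, not the output of a tree-learning algorithm: the function name predict_play is hypothetical, and the rule for sunny days (play only when humidity is normal) is an assumption read off the table above, since the text only spells out the cloudy and rainy branches.

```python
# Training rows from the table above: (weather, humidity, wind, play)
ROWS = [
    ("Sunny", "High", "Weak", "No"),
    ("Cloudy", "High", "Weak", "Yes"),
    ("Sunny", "Normal", "Strong", "Yes"),
    ("Cloudy", "High", "Strong", "Yes"),
    ("Rainy", "High", "Strong", "No"),
    ("Rainy", "Normal", "Strong", "No"),
    ("Rainy", "High", "Weak", "Yes"),
    ("Sunny", "High", "Strong", "No"),
    ("Cloudy", "Normal", "Weak", "Yes"),
    ("Rainy", "High", "Strong", "No"),
]


def predict_play(weather, humidity, wind):
    """Hand-written tree: root asks about weather, then humidity or wind."""
    if weather == "Cloudy":
        return "Yes"  # every cloudy row in the table is "Yes"
    if weather == "Sunny":
        # Assumed branch: inferred from the three sunny rows in the table.
        return "Yes" if humidity == "Normal" else "No"
    # Rainy: the text says a weak wind on a rainy day means play.
    return "Yes" if wind == "Weak" else "No"


predictions = [predict_play(w, h, wd) for w, h, wd, _ in ROWS]
correct = sum(p == row[3] for p, row in zip(predictions, ROWS))
print(f"{correct} of {len(ROWS)} training rows classified correctly")
```

Temperature is never consulted here; on this small table the three features used are enough to separate the rows.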

Did you notice anything in the above flowchart? If the weather is
cloudy, the answer is always to go and play. Why didn’t the tree split further?
Why did it stop there?

To answer this question, we need to know a few more concepts such as
entropy, information gain, and the Gini index. In simple terms, the output
in the training dataset is always “yes” for cloudy weather;
since there is no disorder here, we don’t need to split the node further.

The goal of machine learning is to decrease the uncertainty or disorder in the
dataset, and for this we use decision trees.

Now you must be wondering: how do I know what the root node should be? What
should the decision nodes be? When should I stop splitting? To decide this, there
is a metric called “Entropy”, which is the amount of uncertainty in the dataset.

Entropy
Entropy is the uncertainty in our dataset, or a measure of disorder.
Let me try to explain this with the help of an example.

Suppose a group of friends is deciding which movie to watch
together on Sunday. There are 2 choices, “Lucy” and “Titanic”, and
everyone has to state their choice. After everyone
gives their answer, we see that “Lucy” gets 4 votes and “Titanic” gets 5 votes.

Which movie do we watch now? Isn’t it hard to choose one movie, because
the votes for both movies are almost equal?

This is exactly what we call disorder: the votes for the two movies are nearly
equal, and we can’t really decide which movie to watch. It
would have been much easier if “Lucy” had 8 votes and “Titanic”
had 2. Then we could easily say that the majority of votes are for “Lucy”, and hence
everyone will be watching this movie.

In a decision tree, the output is mostly “yes” or “no”.

The formula for entropy is:

$$E(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}$$

Here p+ is the probability of the positive class,
p– is the probability of the negative class, and
S is the subset of training examples.
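To make the definition concrete, here is a small sketch that evaluates the two-class entropy for the movie votes from the example above. The function name entropy and its count-based signature are illustrative choices, not something the handout defines.

```python
import math


def entropy(positive, negative):
    """Two-class entropy (in bits) of a node with the given class counts."""
    total = positive + negative
    result = 0.0
    for count in (positive, negative):
        if count:  # a class with zero examples contributes nothing
            p = count / total
            result -= p * math.log2(p)
    return result


print(round(entropy(4, 5), 3))  # "Lucy" 4, "Titanic" 5: close to the maximum of 1
print(round(entropy(8, 2), 3))  # "Lucy" 8, "Titanic" 2: noticeably lower
print(entropy(9, 0))            # unanimous vote: no uncertainty at all
```

The nearly tied vote gives an entropy just under 1 bit, while the 8-to-2 vote gives a clearly smaller value, matching the intuition that it is easier to decide.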

How do Decision Trees use Entropy?

Now we know what entropy is and what its formula is. Next, we need to know
how exactly it works in this algorithm.

Entropy measures the impurity of a node. Impurity is the degree of
randomness; it tells us how random our data is. A pure sub-split means that
a node contains either only “yes” or only “no”.

Suppose a feature has 8 “yes” and 4 “no” initially; after the first split, the left
node gets 5 “yes” and 2 “no”, whereas the right node gets 3 “yes” and 2 “no”.

We see here that the split is not pure. Why? Because we can still see some negative
classes in both nodes. To make a decision tree, we need to calculate the
impurity of each split, and when the purity is 100%, we make the node a leaf node.

To check the impurity of feature 2 and feature 3, we will use the
entropy formula.

For feature 2 (the left node, with 5 “yes” and 2 “no”),

$$E = -\tfrac{5}{7}\log_2\tfrac{5}{7} - \tfrac{2}{7}\log_2\tfrac{2}{7} \approx 0.86$$

For feature 3 (the right node, with 3 “yes” and 2 “no”),

$$E = -\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} \approx 0.97$$

We can see from the tree itself that the left node has lower entropy, or more
purity, than the right node, since the left node has a greater proportion of “yes”
and it is easier to decide there.

Always remember that the higher the entropy, the lower the
purity.

As mentioned earlier, the goal of machine learning is to decrease the uncertainty
or impurity in the dataset. Entropy gives us the impurity
of a particular node, but it does not tell us whether the entropy of the parent
has decreased after a split.

For this, we bring in a new metric called “Information gain”, which tells us how
much the parent entropy has decreased after splitting on some feature.

Information Gain

Information gain measures the reduction of uncertainty given some feature, and
it is also the deciding factor for which attribute should be selected as a decision
node or root node.

It is just the entropy of the full dataset minus the entropy of the dataset given
some feature.
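As a rough sketch of that definition, the code below computes the information gain of the split discussed earlier (a parent with 8 “yes” and 4 “no”, split into nodes with 5/2 and 3/2). The "entropy given some feature" is taken here as the average child entropy weighted by node size, which is the usual reading; the function names are illustrative.

```python
import math


def entropy(counts):
    """Entropy (in bits) of a node from its per-class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def information_gain(parent, children):
    """Parent entropy minus the size-weighted average entropy of the children."""
    total = sum(parent)
    weighted = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Parent: 8 "yes", 4 "no"; left child: 5/2; right child: 3/2.
gain = information_gain((8, 4), [(5, 2), (3, 2)])
print(round(gain, 3))
```

The gain for this particular split is very small, which matches the earlier observation that neither child node is close to pure.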

To understand this better, let’s consider an example. Suppose our entire
population has a total of 30 instances. The dataset is used to predict whether a
person will go to the gym or not. Let’s say 16 people go to the gym and 14
people don’t.

We have two features to predict whether a person will go to the gym or not.

Feature 1 is “Energy”, which takes two values: “high” and “low”.

Feature 2 is “Motivation”, which takes 3 values: “No motivation”, “Neutral” and
“Highly motivated”.

Let’s see how our decision tree will be made using these 2 features. We’ll use
information gain to decide which feature should be the root node and which
feature should be placed after the split.

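Let’s calculate the entropy of the parent node:

$$E(\text{Parent}) = -\tfrac{16}{30}\log_2\tfrac{16}{30} - \tfrac{14}{30}\log_2\tfrac{14}{30} \approx 0.997$$

To get the weighted average of the entropy of each child node, we weight each
node's entropy by the fraction of instances that fall into it, where S_v is the
subset of S with Energy = v:

$$E(\text{Parent} \mid \text{Energy}) = \sum_{v \in \{\text{high},\, \text{low}\}} \frac{|S_v|}{|S|}\, E(S_v)$$

Now that we have the values of E(Parent) and E(Parent|Energy), the information gain
is:

$$\text{Information Gain} = E(\text{Parent}) - E(\text{Parent} \mid \text{Energy}) \approx 0.37$$

Our parent entropy was near 0.99, and from this value of information
gain we can say that the entropy of the dataset will decrease by 0.37 if we make
“Energy” our root node.

Similarly, we will do this with the other feature, “Motivation”, and calculate its
information gain.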

Let’s calculate the entropy here, for each value of “Motivation”.

To get the weighted average of the entropy of each node, we again weight each
node's entropy by its share of the instances:

$$E(\text{Parent} \mid \text{Motivation}) = \sum_{v} \frac{|S_v|}{|S|}\, E(S_v)$$

Now that we have the values of E(Parent) and E(Parent|Motivation), the information
gain is:

$$\text{Information Gain} = E(\text{Parent}) - E(\text{Parent} \mid \text{Motivation})$$

We now see that the “Energy” feature gives a greater reduction, 0.37, than
the “Motivation” feature. Hence we select the feature with the highest
information gain and split the node based on that feature.

In this example “Energy” will be our root node, and we’ll do the same for the
sub-nodes. Here we can see that when the energy is “high” the entropy is low,
and hence we can say a person will very likely go to the gym if they have high
energy. But what if the energy is low? We will again split the node, based on the
new feature, which is “Motivation”.

When to Stop Splitting?

You must be asking yourself: when do we stop growing our
tree? Real-world datasets usually have a large number of features, which
results in a large number of splits, which in turn gives a huge tree. Such trees
take time to build and can lead to overfitting. That means the tree will give
very good accuracy on the training dataset but bad accuracy on test
data.

There are many ways to tackle this problem through hyperparameter tuning.
We can set the maximum depth of our decision tree using the
max_depth parameter. The larger the value of max_depth, the more complex
your tree will be. The training error will of course decrease if we increase
the max_depth value, but when our test data comes into the picture, accuracy
can become very poor. Hence you need a value that will neither overfit nor
underfit the data, and for this you can use GridSearchCV.
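A minimal sketch of such a search, assuming scikit-learn is the library in use (the parameter names max_depth and GridSearchCV suggest so). The Iris dataset, the 70/30 split, and the candidate depths are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative data only: any labeled dataset would do.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Candidate depths to try; None lets the tree grow until its leaves are pure.
param_grid = {"max_depth": [1, 2, 3, 4, 5, None]}

search = GridSearchCV(
    DecisionTreeClassifier(criterion="entropy", random_state=0),
    param_grid,
    cv=5,  # 5-fold cross-validation on the training set only
)
search.fit(X_train, y_train)

print("best max_depth:", search.best_params_["max_depth"])
print("cross-validated accuracy:", round(search.best_score_, 3))
print("test accuracy:", round(search.score(X_test, y_test), 3))
```

The depth is chosen by cross-validation on the training data, so the test set is only used once at the end to estimate how well the chosen tree generalizes.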

Another way is to set the minimum number of samples for each split, denoted
by min_samples_split. Here we specify the minimum number of
samples required to perform a split. For example, we can require a minimum of 10
samples. That means that if a node has fewer than 10 samples, this parameter
stops further splitting of the node and makes it a leaf
node.
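The effect can be checked on a fitted tree. The sketch below again assumes scikit-learn and uses the Iris dataset purely for illustration; the helper internal_node_sizes is hypothetical and reads the fitted tree's node arrays.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

unrestricted = DecisionTreeClassifier(random_state=0).fit(X, y)
restricted = DecisionTreeClassifier(min_samples_split=10, random_state=0).fit(X, y)


def internal_node_sizes(model):
    """Sample counts of every node that was actually split."""
    tree = model.tree_
    return [
        int(tree.n_node_samples[i])
        for i in range(tree.node_count)
        if tree.children_left[i] != -1  # -1 marks a leaf in scikit-learn
    ]


print("nodes without limit:", unrestricted.tree_.node_count)
print("nodes with min_samples_split=10:", restricted.tree_.node_count)
print("smallest split node:", min(internal_node_sizes(restricted)))
```

Every node that was split in the restricted tree holds at least 10 samples; smaller nodes were left as leaves.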

There are more hyperparameters, such as:

• min_samples_leaf – the minimum number of samples
required to be in a leaf node. The more you increase this number, the
simpler the tree becomes, which lowers the risk of overfitting but raises
the risk of underfitting.
• max_features – the number of features to consider
when looking for the best split.

To read more about these hyperparameters you can read it here.

Pruning
Pruning is another method that can help us avoid overfitting. It
improves the performance of the tree by cutting the nodes or sub-nodes that
are not significant, removing the branches that have very low
importance.

There are mainly 2 ways of pruning:

• Pre-pruning – we stop growing the tree early, which means we can
prune/remove/cut a node if it has low importance while growing the
tree.
• Post-pruning – once the tree is built to its full depth, we start
pruning the nodes based on their significance.
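Both styles can be sketched with scikit-learn, under some assumptions: pre-pruning is shown through a max_depth limit, and post-pruning through cost-complexity pruning (the ccp_alpha parameter), which is one particular post-pruning technique and not necessarily the one the text has in mind. The dataset and the values 2 and 0.02 are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# No pruning: the tree grows until its leaves are pure.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Pre-pruning: growth is stopped early by a depth limit.
pre_pruned = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

# Post-pruning (assumed technique): grow fully, then collapse weak branches.
post_pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X_train, y_train)

for name, model in [("full", full_tree), ("pre-pruned", pre_pruned), ("post-pruned", post_pruned)]:
    print(name, model.tree_.node_count, round(model.score(X_test, y_test), 3))
```

Comparing node counts alongside test accuracy shows how much of the full tree can be removed before accuracy starts to suffer.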

Random Forest algorithm
The Random Forest algorithm is a popular machine learning technique that is
used for both classification and regression tasks. Popular algorithms for building
a single decision tree include ID3, C4.5 and CART. Random Forest is an ensemble
learning method: it combines multiple decision trees to make predictions, which
improves accuracy and reduces overfitting compared with a single tree.
Here's a simple explanation of how the Random Forest algorithm works:
1. Data Preparation: The algorithm requires a labeled dataset, where the input
features (attributes) are used to predict a target variable. The dataset is divided
into a training set and a testing set.
2. Building Decision Trees: Random Forest creates a collection of decision trees.
Each decision tree is built using a random sample of the training data, drawn with
replacement. This random sample is known as a bootstrap sample. Additionally, at
each node of the decision tree, only a random subset of the features is considered
for splitting the data.
3. Voting Mechanism: To make a prediction, the Random Forest algorithm
combines the predictions of all the decision trees. For classification tasks, the most
common class predicted by the trees is chosen as the final prediction. For regression
tasks, the average of all the predictions is taken.
4. Bagging and Randomness: Random Forest introduces two key concepts:
bagging and randomness. Bagging (bootstrap aggregating) refers to the process of
training multiple decision trees on different bootstrap samples of the data and
combining their predictions. Randomness is introduced by considering
random subsets of features at each node, which helps to reduce overfitting and
improve the model's generalization.
5. Predictions and Evaluation: Once the Random Forest model is trained, it
can be used to make predictions on the testing data. The quality of the
predictions can be evaluated using various performance metrics such as accuracy,
precision, recall, or mean squared error, depending on the problem type.
Some key advantages of the Random Forest algorithm include its ability to handle
large datasets with high dimensionality, robustness against overfitting, and good
performance in both classification and regression tasks. However, it may not be as
interpretable as individual decision trees.
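The five steps above can be sketched with scikit-learn's RandomForestClassifier. This is an illustrative setup rather than a prescribed one: the Iris dataset, the 70/30 split, 100 trees, and max_features="sqrt" are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Step 1: labeled data, divided into training and testing sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Steps 2 and 4: many trees, each fit on a bootstrap sample, with a random
# subset of the features considered at every split.
forest = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,
    max_features="sqrt",
    random_state=0,
)
forest.fit(X_train, y_train)

# Steps 3 and 5: the forest combines the trees' predictions, and the result
# is evaluated on the held-out test set.
accuracy = accuracy_score(y_test, forest.predict(X_test))
print("trees in the forest:", len(forest.estimators_))
print("test accuracy:", round(accuracy, 3))
```

For a regression task the same pattern applies with RandomForestRegressor and a metric such as mean squared error.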

For example, consider the fruit basket as the data, as shown in the figure.
n samples are taken from the fruit basket, and an individual
decision tree is constructed for each sample. Each decision tree generates an
output, as shown in the figure. The final output is decided by majority
voting. In the figure, the majority of the decision trees give apple rather than
banana as the output, so the final output is apple.
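The combining step itself is simple. Here is a minimal sketch in plain Python; the list of tree outputs is made up to mirror the apple-versus-banana example, and the function names are illustrative.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Classification: return the label predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]


def average_prediction(predictions):
    """Regression: return the mean of the trees' numeric predictions."""
    return mean(predictions)


# Hypothetical outputs of five trees for one piece of fruit.
tree_outputs = ["apple", "banana", "apple", "apple", "banana"]
print(majority_vote(tree_outputs))          # apple
print(average_prediction([2.0, 3.0, 4.0]))  # 3.0
```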

Important Features of Random Forest

• Diversity: Not all attributes/variables/features are considered while
making an individual tree; each tree is different.
• Less affected by the curse of dimensionality: Since each tree does not
consider all the features, the feature space it searches is reduced.
• Parallelization: Each tree is created independently from different data
and attributes. This means we can use all CPU cores to build random forests.
• Stability: Stability arises because the result is based on majority voting/
averaging.
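The diversity described above comes from two sources of randomness, which can be sketched in plain Python. This is a simplified illustration, not a full forest: the helper names, the 10-row dataset size, the choice of 2 candidate features, and the seed are all assumptions.

```python
import random


def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (a bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def feature_subset(features, k, rng):
    """Pick k distinct features to consider at one split."""
    return rng.sample(features, k)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
features = ["Weather", "Temperature", "Humidity", "Wind"]

for tree_id in range(3):
    rows = bootstrap_indices(10, rng)
    candidates = feature_subset(features, 2, rng)
    print(tree_id, sorted(rows), candidates)
```

Because rows are drawn with replacement, some rows appear several times in a sample and others not at all, so each tree sees a slightly different dataset. Since the trees do not depend on one another, they can also be built in parallel.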
