Lecture 05: Random Forest
LECTURER:
Humera Farooq, Ph.D.
Computer Sciences Department,
Bahria University (Karachi Campus)
Outline
RF as an ensemble method
Overfitting
Some experimental issues
Ensemble methods
A single decision tree does not perform well
But it is super fast
What if we learn multiple trees?
• The main challenge is not to obtain highly accurate base models, but rather to obtain base
models that make different kinds of errors. For example, if ensembles are used for
classification, high accuracy can be achieved when the base models misclassify
different training examples, even if each base classifier's individual accuracy is low.
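The point above can be made concrete with a small simulation (a hedged illustration, not part of the lecture; all names and numbers are made up): three base classifiers that are each only 70% accurate, but wrong on different examples, give a noticeably more accurate majority vote.

import numpy as np

rng = np.random.default_rng(0)
n = 10_000
y = rng.integers(0, 2, size=n)              # true binary labels

def noisy_predictions(y, error_rate, rng):
    """Flip each label independently with the given error rate."""
    flip = rng.random(y.shape) < error_rate
    return np.where(flip, 1 - y, y)

# Three base models, each ~70% accurate, making (mostly) different mistakes.
preds = np.stack([noisy_predictions(y, 0.30, rng) for _ in range(3)])
majority = (preds.sum(axis=0) >= 2).astype(int)   # majority vote of 3 models

for i, p in enumerate(preds, 1):
    print(f"model {i} accuracy: {(p == y).mean():.3f}")
print(f"majority-vote accuracy: {(majority == y).mean():.3f}")   # ~0.78 > 0.70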
Bagging
Bagging (bootstrap aggregating) is a method that results in low variance; it is usually
applied to decision tree methods.
It is a homogeneous weak learners' model: the base learners are trained independently of
one another, in parallel, and their outputs are combined by model averaging.
It is designed to improve the stability and accuracy of machine learning algorithms used in
statistical classification and regression.
If we split the data in different random ways, decision trees give different results: this is
high variance. Bagging decreases the variance and helps to avoid overfitting.
If we had multiple realizations of the data (or multiple samples), we could compute the
prediction on each one and take the average, exploiting the fact that averaging several
uncertain estimates produces a less uncertain result.
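A small NumPy sketch of that idea (illustrative numbers, not from the lecture): the variance of an average of B independent estimates shrinks roughly as 1/B.

import numpy as np

rng = np.random.default_rng(1)
reps, B, n = 2_000, 50, 100     # repetitions, number of samples, size of each sample

# Each row is one set of B independent "realizations" of the data.
data = rng.normal(loc=5.0, scale=2.0, size=(reps, B, n))
single_estimates = data[:, 0, :].mean(axis=1)        # one estimate per repetition
averaged_estimates = data.mean(axis=2).mean(axis=1)  # average of B estimates

print("variance of a single estimate:", single_estimates.var())
print("variance of the averaged one :", averaged_estimates.var())  # about 1/B as large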
The Random Forest model uses bagging with decision trees, which are high-variance base
models, and additionally selects features at random when growing each tree. Several such
random trees together make a Random Forest.
Bagging
• Step 1: Multiple subsets of equal size are created from the original data set by
selecting observations with replacement.
• Step 2: A base model is created on each of these subsets.
• Step 3: Each model is learned in parallel with each training set and independent of
each other.
• Step 4: The final predictions are determined by combining the predictions from all
the models.
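A minimal Python sketch of these four steps with scikit-learn, assuming a generic tabular dataset; the dataset, base model, and number of subsets are illustrative choices, not prescribed by the lecture.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

B = 25                                   # number of bootstrap subsets / base models
models = []
for _ in range(B):
    # Step 1: bootstrap subset of equal size, sampled with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    # Steps 2-3: fit an independent base model on this subset (parallelizable).
    models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Step 4: combine the predictions -- here, a majority vote over the B trees.
all_preds = np.stack([m.predict(X) for m in models])          # shape (B, n_samples)
bagged = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, all_preds)
print("training accuracy of the bagged ensemble:", (bagged == y).mean())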
Variable Importance Measures
Bagging results in improved accuracy over prediction using a single tree.
Unfortunately, the resulting model is difficult to interpret.
Bagging improves prediction accuracy at the expense of interpretability.
Variable importance measures summarize how much each predictor contributes across all
the bagged trees, recovering some of that lost interpretability.
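One common way to compute such a measure is sketched below; this is an illustration rather than the lecture's own example, using scikit-learn's impurity-based feature_importances_ on a random forest fitted to a stand-in dataset.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()              # stand-in dataset, not from the lecture
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(data.data, data.target)

# Impurity-based importance: the total decrease in node impurity attributable to
# each feature, averaged over all trees in the forest.
ranked = sorted(zip(data.feature_names, forest.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name:25s} {score:.3f}")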
Bagging - issues
Each tree is identically distributed (i.d.)
the expectation of the average of B such trees is the same as the expectation of any
one of them
the bias of bagged trees is the same as that of the individual trees
i.d. and not i.i.d. (independent and identically distributed): the trees are grown on
overlapping bootstrap samples, so they are correlated
We can penalize the splitting (like in pruning) with a penalty term that depends on the
number of times a predictor is selected at a given length
The generalization error of a forest of tree classifiers depends on the strength of the
individual trees in the forest and the correlation between them.
Using a random selection of features to split each node yields error rates that
compare favorably to AdaBoost and are more robust with respect to noise.
Instead of relying on one decision tree, the random forest takes a prediction from each
tree and, based on the majority vote of those predictions, produces the final output.
A greater number of trees in the forest generally leads to higher accuracy and helps
prevent overfitting.
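As a hedged usage sketch (the dataset and tree counts are illustrative), scikit-learn's RandomForestClassifier implements this scheme; strictly speaking it averages the trees' predicted class probabilities, which plays the role of the majority vote described above. Increasing the number of trees typically stabilises the accuracy.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# More trees -> lower variance of the ensemble; accuracy typically plateaus.
for n_trees in (1, 10, 100):
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=0).fit(X_tr, y_tr)
    print(n_trees, "trees -> test accuracy:", round(rf.score(X_te, y_te), 3))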
Random Forests Algorithm
For b = 1 to B:
(a) Draw a bootstrap sample Z∗ of size N from the training data.
(b) Grow a random-forest tree to the bootstrapped data, by recursively
repeating the following steps for each terminal node of the tree, until the minimum
node size nmin is reached.
i. Select m variables at random from the p variables.
ii. Pick the best variable/split-point among the m.
iii. Split the node into two daughter nodes.
Output the ensemble of trees.
In practice the best values for these parameters will depend on the problem, and they
should be treated as tuning parameters.
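A compact sketch of this loop, under the assumption that the per-node step (select m of the p variables, pick the best split) can be delegated to scikit-learn's DecisionTreeClassifier via its max_features argument; the dataset and the values of B, m and nmin are illustrative.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=25, random_state=1)
N, p = X.shape
B, m, nmin = 100, int(np.sqrt(p)), 5   # m ~ sqrt(p) is a common default for classification
rng = np.random.default_rng(1)

forest = []
for b in range(B):
    # (a) Draw a bootstrap sample Z* of size N.
    idx = rng.integers(0, N, size=N)
    # (b) Grow a tree; max_features=m picks m variables at random at every node,
    #     and min_samples_leaf stands in for the minimum node size nmin.
    tree = DecisionTreeClassifier(max_features=m, min_samples_leaf=nmin, random_state=b)
    forest.append(tree.fit(X[idx], y[idx]))

# Output the ensemble of trees; predict by majority vote.
votes = np.stack([t.predict(X) for t in forest])
y_hat = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)
print("training accuracy:", (y_hat == y).mean())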
Random Forest Classifier
[Figure: building a random forest from N training examples described by M features]
• Create bootstrap samples from the training data.
• Construct a decision tree from each bootstrap sample.
• At each node, choose the split feature only among m < M randomly selected features.
• Take the majority vote across the trees.
Aggregation: the trees' predictions are combined by majority vote (classification) or by
averaging (regression).
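A small sketch of the aggregation step on its own, assuming the per-tree predictions have already been collected into an array (the numbers are made up):

import numpy as np

# Hypothetical per-tree class predictions for 6 examples from a 5-tree forest.
class_votes = np.array([[0, 1, 1, 2, 0, 1],
                        [0, 1, 1, 2, 0, 0],
                        [1, 1, 0, 2, 0, 1],
                        [0, 1, 1, 1, 0, 1],
                        [0, 0, 1, 2, 2, 1]])     # shape (n_trees, n_examples)

# Classification: the final label for each example is the most frequent vote.
majority = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, class_votes)
print("majority vote:", majority)                # -> [0 1 1 2 0 1]

# Regression: the final prediction would instead be the mean across the trees.
reg_preds = np.array([[2.1, 3.0], [1.9, 3.4], [2.3, 2.8]])
print("averaged prediction:", reg_preds.mean(axis=0))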
Example
4,718 genes measured on tissue samples from 349 patients.
Each gene has a different expression level.
Each of the patient samples has a qualitative label with 15
different levels: either normal or 1 of 14 different types of
cancer.
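The lecture gives no code for this example; below is a hedged sketch of how a 349 x 4,718 expression matrix with 15 class labels could be handled with a random forest, using synthetic stand-in data of the same shape (so the reported accuracy will sit near chance; real expression data would do much better).

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(349, 4718))         # stand-in for the gene expression matrix
y = rng.integers(0, 15, size=349)        # 15 levels: normal or 1 of 14 cancer types

# With p >> n, a small max_features per split keeps the trees decorrelated.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt", n_jobs=-1,
                            random_state=0)
scores = cross_val_score(rf, X, y, cv=5)
print("cross-validated accuracy:", scores.mean().round(3))   # near 1/15 on noise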
Bagging: Generate multiple trees from bootstrapped data and average the trees.
Bagging results in i.d. trees and not i.i.d.
Boosting (for contrast with bagging)
1. Initialise the data set and assign an equal weight to each data point.
2. Provide this as input to the model and identify the wrongly classified data points.
3. Increase the weights of the wrongly classified data points and decrease the weights of
the correctly classified data points. Then normalize the weights of all data points.
4. If the required results have been achieved, go to step 5; otherwise go back to step 2.
5. End
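A hedged sketch of the weight-update idea in these steps, in the style of AdaBoost; the base learner, the up/down-weighting factors, and the number of rounds are illustrative choices, not taken from the lecture.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=3)
w = np.full(len(X), 1 / len(X))          # step 1: equal weight for every data point

for round_ in range(5):                  # step 4: repeat until satisfied
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    wrong = stump.predict(X) != y        # step 2: find the wrongly classified points
    # Step 3: up-weight the misclassified points, down-weight the rest, then normalize.
    w = np.where(wrong, w * 2.0, w * 0.5)
    w /= w.sum()
    print(f"round {round_ + 1}: weight now on misclassified points = {w[wrong].sum():.3f}")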