
Ensemble Learning Techniques

Agenda

• Ensemble Learning
• Boosting
• Gradient Boosting and XGBoost
• Overfitting/Underfitting
• How to address Overfitting/Underfitting
Ensemble Learning

• Ensemble methods are machine learning techniques that combine several base models in order to produce one optimal predictive model.
• The process of generating models from data is called learning or training, and the learned model can be called a hypothesis or a learner.
• This type of machine learning algorithm helps improve the overall performance of the model.
• Learning algorithms which construct a set of classifiers are known as ensemble methods.
Ensemble Learning
[Figure: Single Model Prediction vs Ensemble Learner]
Why Ensemble Methods?
• A diverse set of models is likely to make better decisions than a single model.
• A decision tree works on a set of rules and provides a predictive output: the rules form the internal nodes, their outcomes are the branches, and the leaf nodes constitute the final decision. An example is a decision tree for a bank loan decision.
One classifier is not enough!

• Performance
– None of the classifiers is perfect
– Complementary
• Examples which are not correctly classified by one
classifier may be correctly classified by the other
classifiers
• Potential Improvements
– Utilize the complementary property
An Example
[Decision tree: the root node tests "Male?"; if Yes, test "Age > 9?" (Yes → 1, No → 0); if No, test "Age > 10?" (Yes → 1, No → 0). Target: Height > 55".]

Name    Age   Male?   Height > 55"
Alice    14     0         1
Bob      10     1         1
Carol    13     0         1
Dave      8     1         0
Erin     11     0         0
Frank     9     1         1
Gena      8     0         0
Ensembles of Classifiers
Combine the classifiers to improve the performance.
There are two ways to combine the classification results from different classifiers to produce the final output:
• Unweighted voting
• Weighted voting
Example: Weather Forecast
[Figure: five individual forecasters each make some errors (marked X) against reality; combining their forecasts by majority vote removes most of the errors.]
Types of Ensemble methods:
The three most popular methods for combining the predictions from different models are:

• Bagging. Building multiple models (typically of the same type) from different subsamples of the training dataset.

• Boosting. Building multiple models (typically of the same type), each of which learns to fix the prediction errors of a prior model in the chain.

• Voting. Building multiple models (typically of differing types) and using simple statistics (like calculating the mean) to combine their predictions.
Bias and Variance
• Bias is an error that occurs due to incorrect assumptions in
our algorithm; a high bias indicates our model is too
simple/underfit.
• Variance is the error caused by the sensitivity of the model to very small fluctuations in the data set; a high variance indicates our model is highly complex/overfit.
• An ideal ML model should have a proper balance between
bias and variance.
Ensemble methods
• Ensemble methods that minimize variance
  – Bagging
  – Random Forests
• Ensemble methods that minimize bias
  – Functional Gradient Descent
  – Boosting
  – Ensemble Selection
• Q.1 What is Ensemble Learning?
• Q.2 What is the need for ensemble learning in ML?
• Q.3 Why is one classifier not enough in Machine Learning?
• Q.4 What are the types of Ensemble Methods?
Bagging
Bootstrap AGGregating, or BAGGing, gets its name because it combines Bootstrapping and Aggregation to form one ensemble model.

• Given a sample of data, multiple subsamples are pulled and a Decision Tree is formed on each of the subsamples.
• An algorithm is then used to aggregate over the Decision Trees to form the most efficient predictor.
• Once we have a prediction from each model, a model averaging technique is used to get the final prediction output.
• One of the most famous techniques based on Bagging is the Random Forest, which uses multiple decision trees. A minimal sketch of both follows below.
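Below is a minimal, hedged sketch of bagging and Random Forest using scikit-learn. The dataset, parameter values, and use of make_classification are illustrative assumptions, not part of the original slides.

```python
# Illustrative sketch only: bagging and Random Forest with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Plain bagging: decision trees (the default base estimator) are fit on
# bootstrapped subsamples and their predictions are averaged.
bag = BaggingClassifier(n_estimators=50, random_state=0)
bag.fit(X_train, y_train)
print("bagging accuracy:", bag.score(X_test, y_test))

# Random Forest: bagged decision trees plus random feature selection at each split.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
print("random forest accuracy:", rf.score(X_test, y_test))
```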
Bagging
Given a dataset, subsamples are pulled and a Decision Tree is formed on each bootstrapped sample. The results of each tree are aggregated to yield the strongest, most accurate predictor.

[Figure: a table of Person / Age / Male? / Height > 55" records; several bootstrapped subsamples (e.g. Alice 14/0/1, Bob 10/1/1, Carol 13/0/1, Dave 8/1/0, Erin 11/0/0, Frank 9/1/1, Gena 8/0/0) are drawn from it, each used to fit its own decision tree h(x), and the trees' predictions are aggregated.]

Generalization error: L(h) = E_{(x,y)~P(x,y)}[ f(h(x), y) ], where f is the loss function.
Boosting
• The term 'Boosting' refers to a family of algorithms which convert weak learners into strong learners.
• A weak learner is a classifier whose predictions agree with the actual classification only to a small extent, while a strong learner is a classifier that is well correlated with the actual classification.
• To find a weak rule, we apply base learning (ML) algorithms with a different distribution each round. Each time the base learning algorithm is applied, it generates a new weak prediction rule. After many iterations, the boosting algorithm combines these weak rules into a single strong prediction rule.
Boosting
Choosing a different distribution for each round

In boosting we take records from the dataset and pass them to the base learners sequentially.

• Suppose we have m records in the dataset. We pass a few records to base learner BL1, train it, and then pass all the records from the dataset through it to see how it performs.

• The records which are classified incorrectly by BL1 are passed to another base learner, say BL2, and similarly the records misclassified by BL2 are used to train BL3.

• This goes on until we reach the specified number of base learner models.

• Finally, we combine the outputs from these base learners to create a strong learner; as a result, the prediction power of the model is improved. A minimal sketch of this sequential idea follows below.
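The following is a rough sketch of the sequential idea described above, not a full boosting algorithm such as AdaBoost: each new decision stump is trained on the records the previous one got wrong, and the stumps are combined by majority vote. The dataset and all parameter values are illustrative assumptions.

```python
# Rough sketch of sequential weak learners; not a production boosting algorithm.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

learners = []
X_cur, y_cur = X, y
for _ in range(3):                        # number of base learners we want
    stump = DecisionTreeClassifier(max_depth=1).fit(X_cur, y_cur)
    learners.append(stump)
    wrong = stump.predict(X) != y         # records this learner misclassified
    if not wrong.any():
        break
    X_cur, y_cur = X[wrong], y[wrong]     # the next learner focuses on the errors

# Combine the weak learners by (unweighted) majority vote.
votes = np.mean([lrn.predict(X) for lrn in learners], axis=0)
final_pred = (votes >= 0.5).astype(int)
print("training accuracy:", (final_pred == y).mean())
```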
Top advantages and disadvantages
Advantages of Bagging
• Multiple weak learners can work better than a single strong
learner.
• It provides stability and increases the accuracy of the ML
algorithm that is used in classification and regression.
• It helps in reducing variance i.e. it avoids overfitting.

Disadvantages of Bagging
• It may result in high bias if it is not modelled properly and
thus may result in underfitting.
• Since we must use multiple models, it becomes
computationally expensive and may not be suitable in
various use cases.

Advantages of Boosting
• It is one of the most successful techniques in solving the
two-class classification problems.
• It is good at handling the missing data.

Disadvantages of Boosting
• Boosting is hard to implement in real time due to the increased complexity of the algorithm.
• The high flexibility of these techniques results in a large number of parameters that have a direct effect on the behaviour of the model.
Types of Boosting Algorithms
• Gradient Tree Boosting
• XGBoost
• Q.1 What do you mean by Bagging?
• Q.2 What do you mean by Boosting?
• Q.3 What is the goal of boosting?
• Q.4 What are the different methods of Boosting?
Boosting Algorithm: Gradient Boosting

Gradient boosting is a technique for regression and classification problems. It produces a prediction model in the form of an ensemble of weak prediction models.

The accuracy of a predictive model can be boosted in two ways:
a. By using feature engineering, or
b. By applying boosting algorithms.

There are many boosting algorithms, such as:
• Gradient Boosting
• XGBoost
• AdaBoost
[Figure: internal working of a boosting algorithm]
Gradient Boosting
Gradient boosting Algorithm involves three elements:
• A loss function to be optimized.
• Weak learner to make predictions.
• An additive model to add weak learners to minimize the loss function.
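A minimal, hedged sketch of these three elements using scikit-learn's GradientBoostingClassifier is shown below; the dataset and all parameter values are illustrative assumptions.

```python
# Illustrative gradient boosting sketch with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Loss function (log-loss), weak learners (shallow trees of depth 3), and the
# additive model (100 stages, each scaled by the learning rate).
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
gb.fit(X_train, y_train)
print("test accuracy:", gb.score(X_test, y_test))
```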
Extreme Gradient Boosting (XGBoost)
• XGBoost Algorithm is an implementation of gradient boosted decision
trees, designed for speed and performance.
• Basically, it is a type of software library. It can be used for supervised
learning tasks such as Regression, Classification, and Ranking.
• It is built on the principles of gradient boosting framework and designed to
“push the extreme of the computation limits of machines to provide
a scalable, portable and accurate library.”
System Feature- XGBoost
For use of a range of computing environments this library provides:
• Parallelization of tree construction using all of your CPU cores during
training.
• Distributed Computing for training very large models using a cluster of
machines & Out-of-Core Computing for very large datasets that don’t fit
into memory.
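A hedged sketch using the xgboost package's scikit-learn wrapper follows; it assumes the xgboost library is installed (pip install xgboost), and the dataset and parameter values are arbitrary choices for illustration.

```python
# Illustrative XGBoost sketch via its scikit-learn-style wrapper.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_jobs=-1 parallelizes tree construction across all available CPU cores.
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1, n_jobs=-1)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```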
Comparison: XGBoost
What is Bias?
• Bias is how far the predicted values are from the actual values. If the average predicted values are far off from the actual values, then the bias is high.
• High bias causes the algorithm to miss relevant relationships between the input and output variables. When a model has high bias, it implies that the model is too simple and does not capture the complexity of the data, thus underfitting it.
What is Variance?
• Variance occurs when the model performs well on the training dataset but does not do well on a dataset that it was not trained on, such as a test or validation dataset. Variance tells us how scattered the predicted values are from the actual values.
• High variance causes overfitting, which implies that the algorithm models the random noise present in the training data.
What is Underfitting?
• A statistical model or an algorithm is said to underfit when it cannot capture the underlying trend of the data.
• Underfitting destroys the accuracy of our machine learning model.
• Its occurrence simply means that our model or algorithm does not fit the data well enough.
• It usually happens when we have too little data to build an accurate model, or when we try to fit a linear model to non-linear data.
Underfitting
• Underfitting can be avoided by using more data and by increasing model complexity, for example through feature engineering.

Underfitting – High bias and low variance


What is Overfitting?

• Overfitting refers to a model that models the training data too well.
• Overfitting happens when a model 'learns' the detail and noise in the training data to the extent that it harms its predictions on new data.
• This means that the noise or random fluctuations in the training data are picked up and learned as concepts by the model.
Overfitting
• When a model is trained on too much noisy data, it starts learning from the noise and inaccurate data entries in our data set.

Overfitting – High variance and low bias


How to reduce Overfitting?

Techniques to reduce overfitting:
1. Increase training data.
2. Reduce model complexity.
3. Early stopping during the training phase (keep an eye on the loss over the training period; as soon as the loss begins to increase, stop training) – see the sketch below.
4. Use dropout for neural networks to tackle overfitting.
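As a hedged illustration of early stopping (technique 3), scikit-learn's GradientBoostingClassifier can stop adding trees once a held-out validation score stops improving; the dataset and parameter values below are assumptions for the sketch.

```python
# Illustrative early stopping with scikit-learn's GradientBoostingClassifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=2000, random_state=0)

gb = GradientBoostingClassifier(
    n_estimators=1000,          # upper bound on the number of boosting rounds
    validation_fraction=0.2,    # hold out 20% of the training data for validation
    n_iter_no_change=10,        # stop if the validation score stops improving for 10 rounds
    random_state=0,
)
gb.fit(X, y)
print("boosting rounds actually used:", gb.n_estimators_)
```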


How to reduce Underfitting?

Techniques to reduce underfitting:
1. Increase model complexity.
2. Increase the number of features, e.g. by performing feature engineering.
3. Remove noise from the data.
4. Increase the number of epochs or the duration of training to get better results.
AdaBoost
• AdaBoost is short for Adaptive Boosting.
• It combines multiple classifiers to increase the accuracy of classification.
• AdaBoost is an iterative ensemble method.
• An AdaBoost classifier builds a strong classifier by combining multiple poorly performing classifiers, so that you get a high-accuracy strong classifier.
AdaBoost
→ The weak learners in AdaBoost are decision trees with a single split, called decision stumps.
→ AdaBoost works by putting more weight on instances that are difficult to classify and less on those already handled well.
→ AdaBoost algorithms can be used for both classification and regression problems. A minimal sketch follows below.
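Below is a minimal, hedged AdaBoost sketch with scikit-learn using decision stumps (max_depth=1) as the weak learners; the dataset and parameter values are illustrative assumptions. Note that recent scikit-learn versions use the estimator keyword, while older ones use base_estimator.

```python
# Illustrative AdaBoost sketch with decision stumps as weak learners.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # decision stump
    n_estimators=100,
    learning_rate=1.0,
    random_state=0,
)
ada.fit(X_train, y_train)
print("test accuracy:", ada.score(X_test, y_test))
```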
Voting
• Voting is one of the simplest ways of combining the
predictions from multiple machine learning algorithms.
• It works by first creating two or more standalone models
from your training dataset. A Voting Classifier can then be
used to wrap your models and average the predictions of
the sub-models when asked to make predictions for new
data.
• You can create a voting ensemble model for classification
using the VotingClassifier class.
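A hedged sketch of a voting ensemble built with scikit-learn's VotingClassifier follows; the three sub-models and their names are arbitrary choices for illustration.

```python
# Illustrative voting ensemble with scikit-learn's VotingClassifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

voting = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=3)),
        ("svm", SVC()),
    ],
    voting="hard",   # unweighted majority vote of the sub-models
)
voting.fit(X_train, y_train)
print("test accuracy:", voting.score(X_test, y_test))
```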
Q.1 What is Gradient Boosting?

Q.2 What is XGBoosting?

Q.3 What is Overfitting?

Q.4 What is Underfitting?

Q.5 How to reduce Overfitting?


Thank You
