
Ensemble Learning:

Machine learning is great, but there is one thing that makes it even better: ensemble learning. Ensemble learning helps enhance the performance of machine learning models. The concept behind it is simple: multiple machine learning models are combined to obtain a more accurate model.

Bagging, boosting, and stacking are the three most popular ensemble learning techniques. Each offers a unique approach to improving predictive accuracy, and each is used for a different purpose depending on the problem at hand. Although the techniques differ, many of us find it hard to distinguish between them and to know when or why to use each one.

How Ensemble Learning Works

Ensemble learning is a learning method that combines multiple machine learning models.

A problem in machine learning is that individual models tend to perform poorly; in other words, they tend to have low prediction accuracy. To mitigate this problem, we combine multiple models to obtain one with better performance.

The individual models that we combine are known as weak learners. We call them weak learners because they have either high bias or high variance, and as a result they cannot learn efficiently and perform poorly.

● A high-bias model results from not learning the data well enough. Its predictions do not capture the underlying pattern of the data, so future predictions will be systematically off the mark.

● A high-variance model results from learning the data too well. It varies with each data point, so it is impossible to predict the next point accurately.

Both high-bias and high-variance models thus fail to generalize properly. Weak learners will either make incorrect generalizations or fail to generalize altogether, so their predictions cannot be relied on by themselves.

As we know from the bias-variance trade-off, an underfit model has high bias and low variance, whereas an overfit model has high variance and low bias. In either case, there is no balance between bias and variance; for a balance, both need to be low. Ensemble learning tries to strike this balance by reducing either the bias or the variance.

Ensemble learning aims to reduce the bias if we have weak models with high bias and low variance, and to reduce the variance if we have weak models with high variance and low bias. This way, the resulting model is much more balanced, with low bias and low variance. The resulting model is known as a strong learner. It is more generalized than the weak learners and can therefore make accurate predictions.

Ensemble learning improves a model’s performance in mainly three ways:

● By reducing the variance of weak learners
● By reducing the bias of weak learners
● By improving the overall accuracy of strong learners

Bagging is used to reduce the variance of weak learners. Boosting is used to reduce the bias of weak learners. Stacking is used to improve the overall accuracy of strong learners.

Reducing Variance with Bagging

We use bagging to combine weak learners of high variance. Bagging aims to produce a model with lower variance than the individual weak models. These weak learners are homogeneous, meaning they are of the same type.

Bagging is also known as bootstrap aggregating. It consists of two steps: bootstrapping and aggregation.

Bootstrapping
Bootstrapping involves resampling subsets of data with replacement from an initial dataset. In other words, subsets of data are taken from the initial dataset. These subsets are called bootstrapped datasets or, simply, bootstraps. Resampling ‘with replacement’ means an individual data point can be sampled multiple times. Each bootstrap dataset is used to train a weak learner.
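As a minimal sketch of bootstrapping with NumPy (the toy arrays, the number of bootstraps, and the sample size below are arbitrary choices for illustration, not part of the original article):

import numpy as np

rng = np.random.default_rng(seed=42)

# Toy dataset standing in for the "initial dataset": 10 points with labels
X = np.arange(10).reshape(-1, 1)
y = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 1])

n_bootstraps = 3        # number of bootstrapped datasets
sample_size = len(X)    # each bootstrap is the same size as the original dataset

bootstraps = []
for _ in range(n_bootstraps):
    # Sample indices *with replacement*: a data point may appear more than once
    idx = rng.integers(0, len(X), size=sample_size)
    bootstraps.append((X[idx], y[idx]))

for i, (Xb, yb) in enumerate(bootstraps):
    print(f"bootstrap {i}:", Xb.ravel())   # repeated values show sampling with replacement

Each of these bootstraps would then be used to train one weak learner.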

Aggregating
The individual weak learners are trained independently of each other, and each makes its own predictions. Those predictions are aggregated at the end to produce the overall prediction, using either max voting or averaging.

Max voting is commonly used for classification problems. It consists of taking the mode of the predictions (the most frequent prediction). It is called voting because, as in an election, the premise is that ‘the majority rules’. Each model makes a prediction, and each prediction counts as a single ‘vote’. The most frequent ‘vote’ is chosen as the prediction of the combined model.

Averaging is generally used for regression problems. It involves taking the average of the predictions, and the resulting average is used as the overall prediction of the combined model.
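A minimal sketch of both aggregation rules, assuming each row of the prediction arrays below holds one weak learner’s hypothetical outputs:

import numpy as np

# Hypothetical class predictions from three weak learners for five samples
class_predictions = np.array([
    [0, 1, 1, 0, 1],   # learner 1
    [0, 1, 0, 0, 1],   # learner 2
    [1, 1, 1, 0, 0],   # learner 3
])

# Hypothetical regression predictions from three weak learners for three samples
reg_predictions = np.array([
    [2.1, 3.0, 4.2],
    [1.9, 3.4, 4.0],
    [2.0, 2.8, 4.4],
])

# Max voting: for each sample, the most frequent class across learners wins
voted = np.array([np.bincount(col).argmax() for col in class_predictions.T])
print("max voting:", voted)        # -> [0 1 1 0 1]

# Averaging: the mean prediction across learners
averaged = reg_predictions.mean(axis=0)
print("averaging:", averaged)      # -> [2.0 3.07 4.2] (approximately)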

Steps of Bagging
The steps of bagging are as follows (a library-based sketch follows the list):

● We have an initial training dataset containing n instances.

● We create m subsets of data from the training set. For each subset, we sample n points from the initial dataset with replacement, which means a specific data point can be sampled more than once.

● For each subset of data, we train a weak learner independently. These models are homogeneous, meaning that they are of the same type.

● Each model makes a prediction.

● The predictions are aggregated into a single prediction using either max voting or averaging.
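One way to carry out these steps in practice is scikit-learn’s BaggingClassifier, sketched below; the synthetic dataset and hyperparameter values are placeholder choices, and the estimator keyword assumes a recent scikit-learn release (1.2 or later, where it replaced base_estimator):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the initial training dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 homogeneous weak learners (decision trees), each trained on its own
# bootstrap sample; their predictions are combined by voting.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=50,
    bootstrap=True,      # sample with replacement
    random_state=0,
)
bagging.fit(X_train, y_train)
print("bagging accuracy:", bagging.score(X_test, y_test))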

Reducing Bias by Boosting

We use boosting to combine weak learners with high bias. Boosting aims to produce a model with a lower bias than that of the individual models. As in bagging, the weak learners are homogeneous.

Boosting involves training weak learners sequentially, where each subsequent learner improves on the errors of the previous learners in the sequence. A sample of data is first taken from the initial dataset and used to train the first model, which then makes its predictions. Some samples are predicted correctly and others incorrectly. The samples that are wrongly predicted are reused for training the next model, so subsequent models can improve on the errors of previous models.

Unlike bagging, which aggregates prediction results at the end, boosting aggregates the results at each step, using weighted averaging.

Weighted averaging involves giving the models different weights depending on their predictive power. In other words, it gives more weight to the model with the highest predictive power, because the learner with the highest predictive power is considered the most important.

Steps of Boosting

Boosting works with the following steps (a library-based sketch follows the list):

● We sample m subsets from an initial training dataset.

● Using the first subset, we train the first weak learner.

● We test the trained weak learner using the training data. As a result of the testing, some data points will be incorrectly predicted.

● Each data point with a wrong prediction is sent into the second subset of data, and this subset is updated.

● Using this updated subset, we train and test the second weak learner.

● We continue with the following subsets until the total number of subsets is reached.

● We now have the total prediction. The overall prediction has already been aggregated at each step, so there is no need to calculate it separately.
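As one concrete example of this sequential scheme, AdaBoost in scikit-learn re-weights the misclassified points before training each new learner; the dataset and parameter values below are illustrative assumptions, and the estimator keyword again assumes scikit-learn 1.2 or later:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Decision stumps (depth-1 trees) are classic high-bias weak learners.
# Each new stump focuses on the points the previous ones got wrong,
# and the stumps are combined by weighted voting.
boosting = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=100,
    learning_rate=0.5,
    random_state=0,
)
boosting.fit(X_train, y_train)
print("boosting accuracy:", boosting.score(X_test, y_test))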

Improving Model Accuracy with Stacking

We use stacking to improve the prediction accuracy of strong learners. Stacking aims to create a single robust model from multiple heterogeneous strong learners.

Stacking differs from bagging and boosting in that:

● It combines strong learners.

● It combines heterogeneous models.

● It involves creating a meta-model, a model trained on a new dataset built from the predictions of the base models.

The individual heterogeneous models are trained on the initial dataset. Their predictions form a single new dataset, which is used to train the meta-model that makes the final prediction. The predictions are combined using weighted averaging.

Because stacking combines strong learners, it can combine bagged or boosted models.

Steps of Stacking

The steps of stacking are as follows (a library-based sketch follows the list):

● We use the initial training data to train m different algorithms.

● Using the output of each algorithm, we create a new training set.

● Using the new training set, we train a meta-model.

● Using the results of the meta-model, we make the final prediction. The results are combined using weighted averaging.
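Scikit-learn’s StackingClassifier is one way to realize these steps; the particular base learners (a random forest and an SVM) and the logistic-regression meta-model below are illustrative choices, not prescribed by the article:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Heterogeneous strong learners: a bagged tree ensemble and an SVM...
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", SVC(probability=True, random_state=0)),
]

# ...whose cross-validated predictions become the training set of the meta-model.
stacking = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stacking.fit(X_train, y_train)
print("stacking accuracy:", stacking.score(X_test, y_test))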

When to Use Bagging vs Boosting vs Stacking?

If you want to reduce the overfitting or variance of your model, use bagging. If you are looking to reduce underfitting or bias, use boosting. If you want to increase predictive accuracy, use stacking.

Bagging and boosting both work with homogeneous weak learners. Stacking works with heterogeneous strong learners.

All three of these methods can work with either classification or regression problems.

One disadvantage of boosting is that it is prone to variance or overfitting. It is thus not advisable to use boosting for reducing variance; boosting will do a worse job of reducing variance than bagging.

Conversely, it is not advisable to use bagging to reduce bias or underfitting, because bagging is more prone to bias and does not help reduce it.

Stacked models have the advantage of better prediction accuracy than bagging or boosting. But because they combine bagged or boosted models, they have the disadvantage of needing much more time and computational power. If you are looking for faster results, it’s advisable not to use stacking. However, stacking is the way to go if you’re looking for high accuracy.

Conclusion
One of the first uses of ensemble methods was the bagging technique, which was developed to overcome the instability of decision trees. In fact, the random forest algorithm is an example of the bagging technique: a random forest is an ensemble of multiple decision trees. Decision trees tend to be prone to overfitting, so a single decision tree can’t be relied on for making predictions. To improve the prediction accuracy of decision trees, bagging is employed to form a random forest. The resulting random forest has a lower variance compared to the individual trees.
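For completeness, a random forest can be trained in a few lines; the synthetic dataset and forest size below are again only stand-ins for illustration:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A random forest bags many decision trees (and adds random feature selection),
# which lowers variance relative to any single tree.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
print("random forest accuracy:", forest.score(X_test, y_test))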

The success of bagging led to the development of other ensemble techniques such as boosting, stacking, and many others. Today, these developments are an important part of machine learning.

The many real-life machine learning applications show the importance of these ensemble methods. These applications include many critical systems: decision-making systems, spam detection, autonomous vehicles, medical diagnosis, and many others. These systems are crucial because they have the ability to impact human lives and business revenues. Ensuring the accuracy of machine learning models is therefore paramount. An inaccurate model can lead to disastrous consequences for many businesses or organizations; at worst, it can endanger human lives.

Bagging, boosting, and stacking are important for ensuring the accuracy of models and can help prevent the undesirable consequences caused by inaccurate models. Below are some of the key takeaways from the article:

● Ensemble learning combines multiple machine learning models into a single model, with the aim of increasing the model’s performance.

● Bagging aims to decrease variance, boosting aims to decrease bias, and stacking aims to improve prediction accuracy.

● Bagging and boosting combine homogeneous weak learners. Stacking combines heterogeneous strong learners.

● Bagging trains models in parallel and boosting trains them sequentially. Stacking creates a meta-model.
