Lecture 2.1 - AML
Uploaded by Vivek Sreekar

Advanced Machine Learning with TensorFlow (22TCSE532)
Lecture 2.1: Introduction to Ensemble Methods
Ensemble methods use multiple models together to make better predictions than a single model can.

Bagging
● Trains several models on different parts of the data.
● Combines their predictions to make a final decision (e.g., Random Forest).

Boosting
● Trains models one after another, each trying to fix the mistakes of the previous one.
● Combines their predictions to make a stronger model (e.g., AdaBoost, Gradient Boosting).
Advantages of Ensemble Methods:

● Reduces Overfitting:
○ Because many models are combined, the final prediction is less likely to be overly tailored to the training data, so it performs better on new data.

● Improves Accuracy:
○ Ensemble methods usually give more accurate results than a single model because they combine the strengths of multiple models.
Row Sampling with Replacement

● From a dataset D with n records, each model (M1, M2, …, Mn) is given a sample of m records (m < n); model M1 gets sample d1`, and so on.
● For M2 we resample and pick a different set of records to give as its input.
● This is basically called row sampling with replacement.
● d1` is not equal to d2`, although some records may get repeated.
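The sampling step above can be sketched with Python's standard library; the dataset here is a made-up list of record indices, purely for illustration:

```python
import random

random.seed(42)  # only so the illustration is reproducible

D = list(range(10))  # toy dataset D with n = 10 record indices
m = 6                # each model receives m < n records

# Bootstrap samples for two models: drawn WITH replacement,
# so d1 and d2 generally differ and a record may repeat within a sample.
d1 = random.choices(D, k=m)
d2 = random.choices(D, k=m)
```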

● Take the test data d`` and get each model's prediction (output): e.g., M1 → 1, M2 → 0, M3 → 1, …, Mn → 1.
● Once we get the output for all the different models, we apply a voting classifier: the majority of the votes is considered the final output (here, 1).

This scheme is called BOOTSTRAP AGGREGATION.
d`` 1
M1
m<n
d1`m

d`` (Test Data)


M2 0 Feature(column) sampling
with replacement is also
Dataset d2`m 1 done in RF.
M3 1
n

BOOTSTRAP AGGREGATION
Mn 1

IN RANDOM FOREST M1, M2… Mn ARE REPLACED WITH DECISION TREES.
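The voting-classifier step can be sketched as a plain majority vote over the per-model outputs shown on the slide (1, 0, 1, …, 1):

```python
from collections import Counter

# predictions of M1, M2, M3, ..., Mn on the test record d``
predictions = [1, 0, 1, 1]

# majority vote: the most common output is the final prediction
final_output = Counter(predictions).most_common(1)[0][0]
print(final_output)  # → 1
```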


Random Forest

● The base learner is the Decision Tree.
● The dataset has r rows and n columns; each tree DT1, DT2, …, DTn is trained on a row and feature sample of it.
● The test data d`` goes to every tree, and the majority vote of their outputs (e.g., DT1 → 1, DT2 → 0, DT3 → 1, …, DTn → 1) is the final prediction.
Whenever we create a decision tree to its complete depth, it has 2 properties:

• Low BIAS: it gets trained so well on the training dataset that the training error is very low.
• High VARIANCE: for the test data, such a tree is prone to give a larger amount of errors.

THAT IS WHY, WHEN A DECISION TREE IS CREATED TO ITS COMPLETE DEPTH, IT LEADS TO OVERFITTING.
Now what is happening in Random Forest?

In RF we use multiple decision trees, and as we discussed on the last slide, each individual decision tree has high variance. But when we combine these decision trees with respect to the majority vote, what happens?

HIGH VARIANCE → LOW VARIANCE

When we combine the decision trees via majority vote, the high variance gets converted into low variance: errors made by individual trees tend to cancel out in the vote.
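This variance reduction can be illustrated with a quick simulation. Each "model" below is just a noisy predictor of the same target; averaging 10 of them visibly shrinks the spread. This is a toy sketch of the averaging effect, not an actual forest:

```python
import random
import statistics

random.seed(1)

def noisy_prediction():
    # one high-variance model: true value 0 plus Gaussian noise
    return random.gauss(0, 1)

# spread of a single model's predictions
single = [noisy_prediction() for _ in range(2000)]

# spread of an ensemble averaging 10 independent models
ensemble = [statistics.mean(noisy_prediction() for _ in range(10))
            for _ in range(2000)]

print(statistics.pstdev(single) > statistics.pstdev(ensemble))  # → True
```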
Due to feature and row sampling, newly added data (say 200 new rows on top of a dataset of 1000) will be split across all the models, so the data change will not impact the score of any individual model much; the forest will still generalise.
What if we are handling a regression problem?

Each tree now outputs a number (e.g., DT1 → 1.14, DT2 → 0.95, DT3 → 1.05, …, DTn → 0.87). We take either the mean or the median of the outputs; which one to use depends on the distribution of the outputs. The number of decision trees is a hyperparameter.
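Using the tree outputs from the slide, the two aggregation options look like this:

```python
from statistics import mean, median

# outputs of the individual trees on the test record d`` (from the slide)
tree_outputs = [1.14, 0.95, 1.05, 0.87]

print(mean(tree_outputs))    # → 1.0025
print(median(tree_outputs))  # → 1.0 (average of the two middle values)
```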
What is Out-of-Bag evaluation in Random Forest (Bagging)?

● Each tree is trained on a bootstrap sample of k < n rows; the rows never drawn for a tree are its Out-of-Bag (OOB) data.
● Normally, the data is split into Train and Test, and the Train part is further split into Train (≈ ⅔ · n) and Validation (≈ ⅓ · n).
● If we set the OOB parameter to TRUE, the OOB data becomes (is considered as) the validation data.

What is the OOB score?

It is nothing but the accuracy with respect to the validation dataset.
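The OOB rows for one tree can be sketched with the standard library; in scikit-learn's RandomForestClassifier this idea corresponds to passing oob_score=True. The dataset here is hypothetical:

```python
import random

random.seed(0)
n = 10
rows = list(range(n))

# bootstrap sample for one tree: n draws WITH replacement
bag = random.choices(rows, k=n)

# out-of-bag (OOB) rows: never drawn for this tree,
# so they can serve as validation data for it
oob = [r for r in rows if r not in bag]
```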


What is Boosting?

● Definition: sequentially trains models to correct the errors of previous models.
● Process:
○ Models are trained one after another, each trying to correct the mistakes of the previous one.
○ Weak learners are combined to form a strong learner.
● Types: AdaBoost, Gradient Boosting, XGBoost, CatBoost.

Process:

1. Initialization:
○ Start with an initial model trained on the data.

2. Sequential Training:
○ Train a series of models sequentially.
○ Each new model focuses on the errors made by
the previous models.

3. Weight Adjustment:
○ Increase the weight of incorrectly predicted
examples to emphasize their importance in
subsequent training.

4. Combination:
○ Combine the predictions of all models to make
the final prediction (e.g., weighted sum for
regression, majority voting for classification).
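Step 4 (combination) for classification can be sketched as a weighted vote, where each model's weight (alpha) reflects its accuracy. The numbers below are made up purely for illustration:

```python
# hypothetical performance weights and ±1 predictions of three weak learners
alphas = [0.8, 0.3, 0.5]
predictions = [1, -1, 1]

# weighted sum of the votes; its sign gives the final classification
score = sum(a * p for a, p in zip(alphas, predictions))  # 0.8 - 0.3 + 0.5 = 1.0
final = 1 if score > 0 else -1
print(final)  # → 1
```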
What is AdaBoost? (Adaptive Boosting)

AdaBoost adjusts the weights of incorrectly classified examples so that subsequent models focus more on difficult cases.

In AdaBoost each decision tree is created with a depth of only 1; such one-level trees are called stumps.

Calculating the sample weights:
Initially, every record gets the same sample weight, w = 1/n (for n records).
Selecting a base learner:
Entropy or the Gini coefficient (or both) can be used to select the stump: a stump is built for each feature (f1, f2, f3), and the one with the least value is selected. Say the stump with f1 (employee id) is selected; it classifies 5 records correctly and 1 incorrectly.
Finding the total error:
Now we need to find the total error for the records which are incorrectly classified.

Total error = sum of the sample weights of the wrong outputs
Total error (TE) = 1/6

Finding the performance of the stump:

performance of the stump = ½ ln((1 − TE) / TE) = ½ ln(5) ≈ 0.804
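The calculation above, written out (n = 6 records, one misclassified):

```python
import math

n = 6
total_error = 1 / n  # one of six equally weighted records is wrong

# performance of the stump = ½ ln((1 - TE) / TE)
performance = 0.5 * math.log((1 - total_error) / total_error)
print(round(performance, 3))  # → 0.805 (the slide truncates this to 0.804)
```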

Why have we calculated the total error and the performance of the stump? Because we need to update the sample weights: the weights for the correct predictions will be reduced and those for the wrong predictions increased before sending the data to the 2nd base learner.
Update the weight of the incorrectly classified point:

New sample weight = weight × e^(performance) = 1/6 × e^(0.804) ≈ 0.372

Update the weight of the correctly classified points:

New sample weight = weight × e^(−performance) = 1/6 × e^(−0.804) ≈ 0.07

We can see that the original sample weights add up to 1, but the updated weights do not: their sum is ≈ 0.72. We therefore normalise by dividing each updated weight by the sum of the updated weights (0.72). Using the normalised weights, the sample for the second base learner is formed so that the misclassified records are emphasised for it to learn from.
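Using the rounded per-record weights from the slides (0.07 for each of the five correct records, 0.372 for the wrong one), the update and normalisation steps look like this:

```python
import math

performance = 0.804
w = 1 / 6

# increased weight for the one wrong record, decreased for the five right ones
wrong = round(w * math.exp(performance), 3)   # 0.372
right = round(w * math.exp(-performance), 2)  # 0.07

updated = [right] * 5 + [wrong]
total = sum(updated)  # ≈ 0.72, as on the slide

# normalise so the weights again sum to 1
normalised = [u / total for u in updated]
print(round(sum(normalised), 6))  # → 1.0
```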
Choosing the Second Base Learner

1. Selecting the second learner:
○ In the second iteration, you again train a weak learner on the re-weighted dataset.
○ The new learner will focus more on the samples that were misclassified in the first iteration, because their weights have been increased.
2. Identifying wrong records:
○ The misclassified records from the first iteration are those where the prediction of the first weak learner does not match the true label.
○ These samples now have higher weights, meaning the second weak learner will give them more importance during training.

Iterative Process
● This process repeats for a specified number of iterations or until a certain error threshold is reached.
● Each weak learner contributes to the final strong classifier through a weighted vote based on its accuracy.
Model Validation vs. Model Testing: Overview
Model Validation: This step involves tuning and evaluating the model's performance
during the training phase. It uses a validation set (distinct from the training data) to
assess the model's accuracy and adjust hyperparameters to improve its
generalization capabilities. The goal is to ensure that the model is not overfitting to the
training data.

Model Testing: This is the final evaluation step, performed after the model is trained
and validated. It uses a test set (which the model has never seen before) to measure
the model's true performance in a real-world scenario. The results on the test set
provide an unbiased estimate of how the model will perform on new, unseen data.
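The three datasets described above can be carved out with a minimal stdlib sketch; the sizes and 80/20 then ⅔/⅓ proportions here are illustrative, not prescribed:

```python
import random

random.seed(7)
data = list(range(100))  # 100 hypothetical records
random.shuffle(data)

# hold out 20% as the final test set, never touched during training
test = data[:20]
rest = data[20:]

# split the remainder into train (≈ 2/3) and validation (≈ 1/3)
cut = len(rest) * 2 // 3
train, val = rest[:cut], rest[cut:]

print(len(train), len(val), len(test))  # → 53 27 20
```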
