vsat2k_ML_Ch1a Evaluation of Learning Algorithms - Jan 2025

The document provides an overview of machine learning, focusing on the evaluation of learning algorithms and model accuracy. It discusses key concepts such as bias, variance, overfitting, and underfitting, along with methods to improve model performance. Additionally, it covers different types of models including predictive, descriptive, and prescriptive, and emphasizes the importance of data quality and feature selection in predictive modeling.

Uploaded by

meetttp1210


Machine Learning

Evaluation of Learning Algorithms

Satishkumar L. Varma
Department of Information Technology
SVKM's Dwarkadas J. Sanghvi College of Engineering, Vile Parle, Mumbai.
ORCID | Scopus | Google Scholar | Google Site | Website
Outline
● Machine Learning Model
○ Evaluating a Learning Algorithm
○ Evaluating Hypothesis
○ Model Selection and Train/Validation/Test Sets
○ Bias vs. Variance: Regularization and Bias/Variance, Learning Curves, Error Analysis
○ Handling Skewed Data: Error Metrics for Skewed Classes
○ Trade-off between Precision and Recall
● Model Evaluation
● Model Improvement (Ensemble Learning)
● Optimization

2 Satishkumar L. Varma www.sites.google.com/view/vsat2k

Analytical Models
● Models are used for prediction, description and prescription based on key elements in data.
● These analytical models are used for predicting, understanding and making data-driven decisions.
● Predictive models
○ Designed to make predictions or estimates about future events.
○ These ML models analyze historical data to identify patterns and estimate future outcomes.
○ Predictive modelling is a process used in data science to create a mathematical model.
○ Such models predict an outcome based on input data.
○ Examples: Regression analysis, decision trees, and neural networks.
● Descriptive models
○ Designed to describe the past and present, helping to identify patterns and trends in data.
○ They use statistical analysis techniques to extract meaningful information from the data, such as the mean, standard deviation, distribution, and correlations.
○ Examples: Clustering, association rule mining, and anomaly detection.
● Prescriptive models
○ Designed to provide recommendations or guidance based on the available data regarding future actions.
○ Examples: NLP

Predictive Modeling
● Dependent and independent variables are key concepts in predictive modeling and statistics.
● Dependent Variable (DV)
○ The dependent variable is the main factor or outcome that you're interested in predicting or understanding.
○ It's often denoted as "Y" in mathematical equations.
○ In a study or experiment, the DV is the variable that is measured or observed.
○ For example, in a study looking at the effect of studying time on test scores,
■ the test scores would be the DV because they depend on the amount of time spent studying.
● Independent Variable (IV)
○ IVs are the factors or variables that are manipulated or controlled in a study.
○ They are used to predict or explain changes in the DV.
○ IVs are often denoted as "X" in mathematical equations.
○ In the same example, the IV would be the amount of time spent studying,
■ as this is the variable that is being manipulated to see its effect on test scores.
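
The study-time example can be sketched in a few lines of Python. This is only an illustration: the hours and scores below are made-up numbers, and the helper name `predict_score` is hypothetical. The IV (hours studied) plays the role of X and the DV (test score) plays the role of Y in a straight-line fit.

```python
import numpy as np

# Hypothetical data (not from the slides): hours studied (IV, X) and test scores (DV, Y).
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
scores = np.array([52.0, 58.0, 61.0, 67.0, 73.0, 78.0])

# Fit Y = slope * X + intercept by least squares.
slope, intercept = np.polyfit(hours, scores, deg=1)

def predict_score(h):
    """Predict the DV (test score) from the IV (hours studied)."""
    return slope * h + intercept

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"predicted score for 7 hours: {predict_score(7.0):.1f}")
```

A positive slope here means that, in this toy data, more study time goes with higher scores.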

Predictive Modeling
● The two most commonly employed predictive modeling methods are regression and neural networks (NN).
● The accuracy of every predictive model depends on
○ Quality of your data
○ Choice of variables

Comparison | Descriptive Models | Predictive Models
Basic | It determines what happened in the past by analyzing stored data. | It determines what can happen in the future with the help of past data analysis.
Preciseness | It provides accurate data. | It produces results but does not ensure accuracy.
Practical analysis methods | Standard reporting, query/drill down and ad-hoc reporting. | Predictive modelling, forecasting, simulation and alerts.
Require | It requires data aggregation and data mining. | It requires statistics and forecasting methods.
Type of approach | Reactive approach | Proactive approach
Describe | Describes the characteristics of the data in a target data set. | Carries out induction over the current and past data so that predictions can be made.
Methods (in general) | What happened? Where exactly is the problem? What is the frequency of the problem? | What will happen next? What is the outcome if these trends continue? What actions are required to be taken?

Model Accuracy
● Model Accuracy
○ There will always be a slight difference between what our model predicts and the actual values.
○ ML models allow machines to perform data analysis and make predictions.
○ However, ML models are never perfectly accurate and can make prediction errors.
● Errors in Machine Learning
○ These differences between the model's predictions and the actual values are called errors.
○ The goal of an analyst is not to eliminate errors but to reduce them.
○ There is always a limit to how low the errors can be pushed, because reducing one kind of error tends to increase another.
○ The reducible part of the prediction error is usually described in terms of bias and variance.
○ The aim of a data scientist is to reduce these errors in order to get more accurate results for a particular dataset.
○ Let us understand bias and variance, the bias-variance trade-off, underfitting and overfitting.
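
As a small illustration of what "error" means here, the sketch below compares model predictions against actual values and summarizes the differences. The numbers are invented for the example, and the error measures shown (MAE, MSE, RMSE) are common choices rather than ones named in the slides.

```python
import numpy as np

# Hypothetical actual values and model predictions for five samples.
y_actual = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
y_pred = np.array([2.5, 5.5, 6.0, 9.5, 12.0])

errors = y_actual - y_pred        # per-sample error (actual minus predicted)
mae = np.mean(np.abs(errors))     # mean absolute error
mse = np.mean(errors ** 2)        # mean squared error
rmse = np.sqrt(mse)               # root mean squared error

print("errors:", errors)
print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}")
```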

Errors in Machine Learning
● Error is a measure of how accurately an algorithm can make predictions on seen or unseen data.
● We choose the ML model that reduces error and performs best for a particular dataset.
● Two main types of errors are present in any ML model: reducible and irreducible.
● Reducible Errors
○ These errors can be reduced to improve the model accuracy.
○ Such errors can further be classified into bias and variance.
● Irreducible Errors
○ These errors will always be present in the model due to unknown variables.
● Bias and Variance in Machine Learning
○ Achieving the balance between bias and variance can be challenging.
○ Two common issues that affect model accuracy are overfitting and underfitting.
○ These problems are major contributors to poor performance in ML models.
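
The split into reducible and irreducible error can be written out for squared-error loss. The notation is an assumption introduced here, not taken from the slides: the data follow $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, $\hat{f}$ is the model learned from a random training set, and the expectation is over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The first two terms are the reducible part; the last term does not depend on the model at all.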

Bias and Variance in Machine Learning
● Bias
○ Error due to an overly simple ML model that doesn't learn enough detail from the data.
○ High bias makes the model easier to train, but it fails to capture the underlying complexities of the data.
○ High bias typically leads to underfitting,
○ i.e. the model performs poorly on both training and testing data as it fails to learn enough from the data.
● Variance
○ Error due to an overly complex ML model that learns too much from the data, including random noise.
○ A high-variance model learns not only the patterns but also the noise in the training data.
○ High variance leads to poor generalization on unseen data.
○ High variance typically leads to overfitting,
○ i.e. the model performs well on training data but poorly on testing data.

Bias-Variance Tradeoff
● The goal is to find an optimal balance where the combined error from bias and variance is as low as possible.
● The relationship between bias and variance is often referred to as the bias-variance tradeoff.
● Tradeoff highlights the need for balance:
○ Increasing model complexity reduces bias but increases variance (risk of overfitting).
○ Simplifying the model reduces variance but increases bias (risk of underfitting).
● Example:
○ Suppose we want to predict the price of houses based on their size.
○ We draw a line or curve that best fits the data points on a graph.
○ How well the fitted line captures the trend in the data depends on the complexity of the model used.
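
The house-price example can be sketched with polynomial fits of increasing degree. Everything here is an assumption made for illustration: the data are synthetic (price grows quadratically with size, plus noise), and degrees 1, 2 and 8 stand in for a too-simple, a well-matched and an overly flexible model.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic (assumed) house data: price is quadratic in size, plus noise."""
    size = rng.uniform(0.0, 3.0, n)
    price = 10.0 * size ** 2 + rng.normal(0.0, 1.0, n)
    return size, price

x_train, y_train = make_data(40)
x_test, y_test = make_data(200)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

train_mse, test_mse = {}, {}
for degree in (1, 2, 8):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_mse[degree] = mse(coeffs, x_train, y_train)
    test_mse[degree] = mse(coeffs, x_test, y_test)
    print(f"degree {degree}: train MSE={train_mse[degree]:.2f}, "
          f"test MSE={test_mse[degree]:.2f}")
```

Training error can only fall as the degree rises, so it is the error on the held-out test points that reveals underfitting (degree 1 here) or a tendency to overfit.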

Overfitting
● Overfitting is one of the most common issues faced by machine learning engineers and data scientists.
● It occurs when an ML model tries to cover all, or more than the required, data points in the given dataset.
● As a result, the model captures the noise and inaccurate values present in the training data set.
● It negatively affects the performance of the model on unseen data.
● Overfitting Example:
○ Consider a training data set of 5000 mangoes, 1000 apples, and 1000 papayas.
○ There is a high probability of identifying a papaya as a mango because the training data set is heavily skewed toward mangoes; hence prediction is negatively affected.
○ Overfitting is often caused by flexible non-linear ML methods, which can build unrealistic models of the data.
● We can reduce overfitting by using simpler algorithms, such as linear and parametric ones, in the ML models.
● Example: Overfitting models
○ They are like students who memorize answers instead of understanding the topic.
○ These students do well in practice tests (training) but struggle in real exams (testing).

Overfitting
● Reasons for Overfitting:
○ High variance and low bias.
○ The model is too complex.
○ The training data set is too small.
● Methods to reduce overfitting:
○ Increase the size of the training dataset.
○ Reduce model complexity by selecting a model with fewer parameters.
○ Regularization, i.e. controlling or constraining the model (e.g. Ridge and Lasso regularization).
○ Early stopping during the training phase.
○ Reduce the noise in the data.
○ Reduce the number of features (attributes) in the training data.
○ Ensemble techniques.
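
To make "constraining the model" concrete, here is a minimal sketch of ridge regularization using its closed-form solution. The data are synthetic, the function name `ridge_fit` and the penalty value `lam=10.0` are illustrative assumptions, and no intercept term is fitted to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic (assumed) data: 30 samples, 10 features, only 2 of which matter.
X = rng.normal(size=(30, 10))
true_w = np.zeros(10)
true_w[:2] = [3.0, -2.0]
y = X @ true_w + rng.normal(scale=0.5, size=30)

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

w_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reduces to ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)  # lam > 0 shrinks the weights toward zero

print("||w|| without regularization:", np.linalg.norm(w_ols))
print("||w|| with ridge (lam=10):   ", np.linalg.norm(w_ridge))
```

The penalty trades a little extra bias for lower variance; pushing `lam` too high is the "excessive regularization" that causes underfitting.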

Underfitting
● Underfitting:
○ It is just the opposite of overfitting.
○ Underfitting occurs when an ML model is unable to capture the basic underlying trend of the data.
○ It typically occurs when we train with too little data or try to fit a linear model to non-linear data.
○ It leads to inaccurate predictions and degrades the accuracy of the ML model.
○ In short, the model is too simple to capture the underlying structure of the data.
● Example: Underfitting models
○ It is like students who don't study enough.
○ They don't do well in practice tests or real exams.
○ An underfitting model has high bias and low variance.

Underfitting
● Reasons for Underfitting:
○ The model is too simple, so it may not be capable of representing the complexities in the data.
○ The input features are not adequate to explain the target variable.
○ The size of the training dataset is not enough.
○ Excessive regularization, used to prevent overfitting, constrains the model so much that it cannot capture the data well.
○ Features are not scaled.
● Methods to reduce Underfitting:
○ Increase model complexity.
○ Increase the number of epochs to get better results.
○ Increase the training time of the model.
○ Increase the number of features.
○ Increase the quality of features.
○ Remove noise from the data.
○ Reduce the constraints (e.g. the regularization strength).

Model Evaluation
● Refer slide
○ vsat2k_ML_Ch1b Model Evaluation (Regularization) [ PDF ]

Confusion Matrix
● A confusion matrix shows the number of correct and incorrect predictions made by
○ a classification model compared to the actual outcomes (target values) in the data.
● The matrix is N×N, where N is the number of target values (classes).
● The performance of such models is commonly evaluated using the data in the confusion matrix.
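
A minimal sketch of building an N×N confusion matrix and reading metrics from it follows. The labels are invented, the helper name `confusion_matrix` is our own, and the layout (rows = actual class, columns = predicted class) is an assumed convention, since the slides do not fix one.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """N x N matrix: cm[i, j] counts samples of actual class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        cm[actual, predicted] += 1
    return cm

# Hypothetical binary labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=2)
tn, fp = cm[0]   # actual negatives: true negatives, false positives
fn, tp = cm[1]   # actual positives: false negatives, true positives

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)   # of predicted positives, how many were right
recall = tp / (tp + fn)      # of actual positives, how many were found

print(cm)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

With skewed classes, accuracy alone can look good while precision or recall is poor, which is why the latter two are read off the matrix as well.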

● Refer slide for more examples
○ Vsat IR - Confusion Matrix Complete [ PDF ]

Thank You.
