18 - Deep Learning Frameworks - Data Augmentation - Under-Fitting vs Over-Fitting (22-08-2024)


In deep learning, overfitting and underfitting are two common issues that negatively affect
the performance of models on unseen data. They represent two extremes of the bias-variance
tradeoff.
Overfitting
 Overfitting occurs when a deep learning model learns the training data too well,
including the noise and fluctuations.
 The model becomes extremely well-tuned to the specifics of the training data, making
it perform exceptionally well on this data but poorly on new, unseen data, because it has
memorized the training set rather than learned the underlying patterns.
Characteristics of Overfitting:
- High accuracy on training data but poor generalization to new data.
- The model captures noise and random fluctuations in the training data as if they were
meaningful concepts.
- The learning curve shows that as training progresses, the training error decreases, but
the validation error starts to increase after a certain point.
Common Causes:
- Excessively complex model with too many parameters.
- Insufficient amount of training data.
- Lack of regularization or too little regularization.
- Training for too many epochs, so that the model starts to memorize the data.
How to Combat Overfitting:
- Simplify the model (reduce its complexity).
- Collect more training data or augment the existing dataset.
- Apply regularization techniques (L1, L2, dropout).
- Use early stopping during training (a sketch combining these remedies follows this list).
- Implement cross-validation.
- Utilize data augmentation techniques to increase the diversity of the training set.
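
As a rough illustration of how several of these remedies fit together, here is a minimal Keras sketch; the 20-feature input, the layer sizes, the 0.001 L2 factor, and the 0.3 dropout rate are illustrative assumptions rather than recommended values.

from keras.models import Sequential
from keras.layers import Dense, Dropout, Input
from keras.regularizers import l2
from keras.callbacks import EarlyStopping

# A small binary classifier combining L2 weight penalties, dropout and early
# stopping; the architecture and hyperparameters are illustrative assumptions.
model = Sequential([
    Input(shape=(20,)),                                   # 20 input features (assumed)
    Dense(64, activation='relu', kernel_regularizer=l2(0.001)),
    Dropout(0.3),                                         # drop 30% of the units during training
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Stop once validation loss stops improving and roll back to the best weights.
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
# model.fit(x_train, y_train, validation_split=0.2, epochs=100, callbacks=[early_stop])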
Underfitting
Underfitting, on the other hand, occurs when a model is too simple to capture the underlying
structure of the data. It does not learn well from the training data and, consequently, performs
poorly on both the training data and unseen data.
Characteristics of Underfitting:
- Low accuracy on both training and validation data.
- The model is too simple to capture the complexities and patterns in the data.
- The learning curves show that both the training and validation errors remain high,
indicating high bias.
Common Causes:
- The model is too simple with very few parameters (high bias).
- Insufficient training (too few epochs).
- Overly strong regularization that prevents the model from fitting the data well.
- Poor choice of features in the input data that fails to capture important characteristics.
How to Combat Underfitting:
- Increase model complexity (e.g., more layers, more units per layer); see the sketch after this list.
- Train longer or with more epochs until performance improves.
- Reduce regularization strength.
- Feature engineering to ensure that important characteristics of the data are being fed
into the model.
- Tune model hyperparameters to find a better architecture for the problem at hand.
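
To make the first remedy concrete, the sketch below contrasts a deliberately small model with a higher-capacity one; the 20-feature input and the layer sizes are illustrative assumptions.

from keras.models import Sequential
from keras.layers import Dense, Input

# An underfitting-prone model: a single, very small hidden layer.
small_model = Sequential([
    Input(shape=(20,)),            # 20 input features (assumed)
    Dense(4, activation='relu'),
    Dense(1, activation='sigmoid'),
])

# A higher-capacity alternative: more layers and more units per layer.
larger_model = Sequential([
    Input(shape=(20,)),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid'),
])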
The key to addressing overfitting and underfitting is to strike a balance where the model is
complex enough to learn the underlying patterns in the data but not so complex that it learns
the noise and details specific to the training set. This balance can typically be achieved
through a combination of model selection, regularization, and tuning, alongside a robust
training methodology.
Regularization techniques are methods used to prevent overfitting by imposing
constraints on the model or its learning process. Here are some common regularization
techniques used in deep learning:

L1 Regularization (Lasso)
L1 regularization, also known as Lasso (Least Absolute Shrinkage and Selection Operator),
adds a penalty equal to the absolute value of the magnitude of the coefficients. This can lead
not only to smaller weights but can also produce some coefficients that are exactly zero,
effectively performing feature selection.

from keras.layers import Dense
from keras.regularizers import l1

# Adding L1 regularization to a Dense layer in Keras
# (0.01 is the strength of the L1 penalty applied to the layer's weights)
layer = Dense(64, activation='relu', kernel_regularizer=l1(0.01))  # 64 units chosen for illustration

L2 Regularization (Ridge)
L2 regularization, also known as Ridge, adds a penalty equal to the square of the magnitude
of the coefficients. This penalty encourages the weights to be small but does not enforce them
to be zero.

from keras.layers import Dense
from keras.regularizers import l2

# Adding L2 regularization to a Dense layer in Keras
# (0.01 is the strength of the L2 penalty applied to the layer's weights)
layer = Dense(64, activation='relu', kernel_regularizer=l2(0.01))  # 64 units chosen for illustration
Elastic Net Regularization
Elastic Net is a combination of L1 and L2 regularization and is useful when multiple features
are correlated with one another.

from keras.layers import Dense
from keras.regularizers import l1_l2

# Adding Elastic Net (combined L1 + L2) regularization to a Dense layer in Keras
layer = Dense(64, activation='relu', kernel_regularizer=l1_l2(l1=0.01, l2=0.01))  # 64 units chosen for illustration

Dropout
Dropout is a regularization technique that involves randomly setting a fraction of input units
to 0 at each update during training time, which helps prevent overfitting.

from keras.layers import Dropout

# Adding Dropout to a model in Keras (assumes `model` is an existing Sequential model)
model.add(Dropout(0.5))  # randomly zeroes 50% of the incoming units during training

Early Stopping
Early stopping involves stopping training before the model has fully fitted to the training
data. When the performance on the validation set starts to deteriorate, training is halted to
prevent overfitting.

from keras.callbacks import EarlyStopping

# Using EarlyStopping in Keras: halt once val_loss has not improved for 5 consecutive epochs
early_stopping = EarlyStopping(monitor='val_loss', patience=5)
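
On its own, the callback does nothing until it is handed to training. A minimal usage sketch, assuming a compiled `model` and hypothetical `x_train`, `y_train` arrays:

# Hypothetical training call: the callback is passed via `callbacks`, and the
# validation split provides the val_loss being monitored.
history = model.fit(
    x_train, y_train,
    validation_split=0.2,   # hold out 20% of the training data for validation
    epochs=100,             # upper bound; early stopping usually halts sooner
    callbacks=[early_stopping],
)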

Batch Normalization
Although primarily used to improve training stability and speed (by reducing internal
covariate shift), batch normalization can also have a mild regularizing effect, because the
statistics of each mini-batch add a small amount of noise to the activations.

from keras.layers import BatchNormalization


# Adding Batch Normalization to a model in Keras
model.add(BatchNormalization())

Data Augmentation
Data augmentation artificially increases the size and diversity of the training dataset by
applying random, but realistic, transformations to the input images, such as rotation, scaling,
and cropping.
from keras.preprocessing.image import ImageDataGenerator

# Example of using ImageDataGenerator for data augmentation:
# random rotations, horizontal/vertical shifts and horizontal flips
datagen = ImageDataGenerator(rotation_range=20,
                             width_shift_range=0.2,
                             height_shift_range=0.2,
                             horizontal_flip=True)
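
The generator then supplies augmented batches on the fly during training. A minimal usage sketch, assuming a compiled `model` and hypothetical `x_train` image and `y_train` label arrays:

# Hypothetical usage: train on augmented batches produced on the fly by the generator.
model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=10)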
Noise Injection
Adding noise to inputs can improve robustness and reduce overfitting, as the model learns to
ignore the noise and focus on the underlying data patterns.

from keras.layers import GaussianNoise

# Adding Gaussian noise to a model in Keras (the noise is only applied during training)
model.add(GaussianNoise(0.1))  # zero-mean noise with standard deviation 0.1

Ensemble Methods
Combining the predictions from multiple models can improve generalization by reducing the
model's variance. Ensemble methods include techniques like bagging, boosting, and stacking.
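
Keras has no dedicated ensembling API, but a simple averaging ensemble can be sketched as below, assuming hypothetical, independently trained models `model_a`, `model_b`, `model_c` and test inputs `x_test`:

import numpy as np

# Hypothetical averaging ensemble: each trained model predicts independently,
# and the predictions are averaged to reduce variance.
members = [model_a, model_b, model_c]
predictions = [m.predict(x_test) for m in members]
ensemble_prediction = np.mean(predictions, axis=0)

Averaging smooths out the individual models' errors, which is the variance-reduction effect described above.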

Weight Constraints
By imposing constraints on the norm of the weights, you can ensure that no individual weight
can have a disproportionately large impact on the outcome, promoting a more distributed and
generalized model.

from keras.layers import Dense
from keras.constraints import max_norm

# Adding a max-norm weight constraint to a Dense layer:
# the norm of each unit's incoming weight vector is capped at 2.0
model.add(Dense(64, activation='relu', kernel_constraint=max_norm(2.)))  # 64 units chosen for illustration

These regularization techniques can be used alone or in combination to combat overfitting.
The choice of technique(s) often depends on the specific problem, data characteristics, and
the model being used.

The bias-variance tradeoff is a fundamental concept in supervised learning that describes the
tradeoff between two types of error that affect the performance of a machine learning model:
Bias
Bias refers to the error that results from incorrect assumptions in the learning algorithm. High
bias can cause the model to miss relevant relations between features and target outputs
(underfitting), leading the model to perform poorly on both the training data and unseen data.
 High Bias: Simplistic models with high bias pay little attention to the training data
and oversimplify the model, which can lead to a model that does not capture the
complexity of the data (e.g., linear models).
 Low Bias: Complex models with low bias pay more attention to the training data and
can capture more complex relationships (e.g., deep neural networks).
Variance
Variance refers to the error that results from sensitivity to small fluctuations in the training
set. High variance can cause an algorithm to model the random noise in the training data
rather than the intended outputs (overfitting), leading to poor performance on unseen data.
 High Variance: Models with high variance follow the training data very closely (e.g.,
a model with many parameters such as a deep neural network without proper
regularization).
 Low Variance: Models with low variance are not as affected by the specifics of the
training data and are simpler (e.g., linear models).
Tradeoff
The tradeoff is that to achieve a good model performance, you need to find a balance between
bias and variance, minimizing both errors. Here's why:
 High Bias/Low Variance Models often lead to underfitting, where the model is not
complex enough to capture underlying patterns in the data, and hence has low
predictive performance on both training and unseen data.
 Low Bias/High Variance Models often lead to overfitting, where the model is overly
complex and captures noise in the training data, performing well on the training data
but poorly on unseen data.
In general, as model complexity increases, bias tends to decrease and variance tends to
increase, and vice versa. The ideal situation is to have both low bias and low variance, but in
practice, it's often necessary to compromise:
 Simple models (few parameters): tend to have high bias and low variance.
 Complex models (many parameters, such as deep learning): tend to have low bias
and high variance.
Minimizing the Tradeoff
Machine learning practitioners aim to minimize this tradeoff by:
 Choosing the right model complexity for the given problem and data.
 Using techniques like cross-validation to estimate model performance (a minimal sketch follows this list).
 Implementing regularization techniques to reduce overfitting.
 Gathering more data or constructing a more representative feature set to reduce bias
without increasing variance.
 Using ensemble methods that combine the predictions of several models to reduce
variance.
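
As an illustration of the cross-validation point above, here is a minimal k-fold sketch; the `x` and `y` arrays, the 5-fold split, and the small architecture are all illustrative assumptions rather than a prescribed setup.

import numpy as np
from sklearn.model_selection import KFold
from keras.models import Sequential
from keras.layers import Dense, Input

# Hypothetical k-fold cross-validation loop; x and y are assumed to be NumPy
# arrays of features and binary labels.
def build_model(input_dim):
    model = Sequential([
        Input(shape=(input_dim,)),
        Dense(32, activation='relu'),
        Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(x):
    model = build_model(x.shape[1])                       # fresh model for every fold
    model.fit(x[train_idx], y[train_idx], epochs=20, verbose=0)
    _, accuracy = model.evaluate(x[val_idx], y[val_idx], verbose=0)
    scores.append(accuracy)

print('Mean validation accuracy:', np.mean(scores))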
Understanding and balancing the bias-variance tradeoff is key to building models that
generalize well to new, unseen data.
