One Fourth Labs: L2 Regularization
L2 regularization
What is the intuition behind L2 regularization?
1. Consider the error curves for the training and test sets
2. In the case of squared error loss: Ltrain(θ) = ∑ᵢ (yᵢ − f̂(xᵢ))², where the sum runs over the N training examples
a. Where θ = [W111, W112, ..., WLnk], the collection of all the weights in the network
b. Our aim has been to minimise the loss function: min_θ L(θ)
3. Now, imagine we include a new term in the minimisation objective: min_θ L(θ), where L(θ) = Ltrain(θ) + Ω(θ)
a. Here, in addition to minimising the training loss, we are also minimising some other quantity
that is dependent on our parameters
b. In the case of L2 Regularisation, Ω(θ) = ||θ||₂² (the sum of the squares of the weights)
c. Ω(θ) = W111² + W112² + ... + WLnk²
d. Here, we aim to minimise both Ltrain(θ) and Ω(θ); it would not make sense for either of them to take a high value, as the numeric sketch below illustrates.
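To make this concrete, here is a minimal Python/NumPy sketch of the combined objective. The weights, predictions, and targets are made-up illustrative values, not part of the lecture's example.

    import numpy as np

    # Toy values: a few weights of a small network, flattened into theta.
    theta = np.array([0.5, -1.2, 3.0, 0.1])

    # Squared-error training loss on a made-up prediction/target pair.
    y_true = np.array([1.0, 0.0])
    y_pred = np.array([0.8, 0.3])
    L_train = np.sum((y_true - y_pred) ** 2)

    # L2 regularisation term: Omega(theta) = sum of the squared weights.
    omega = np.sum(theta ** 2)

    # The quantity that gradient descent now minimises.
    L_total = L_train + omega
    print(L_train, omega, L_total)

Note how the single large weight (3.0) contributes 9 to Ω(θ) and dominates the penalty; this is exactly the kind of growth the regulariser discourages.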
4. What if we set all the weights to 0? In this case, the model would not have learned anything useful, so Ltrain(θ) would be high.
5. What if we try to drive Ltrain(θ) all the way to 0? In this case, it is possible that some of the weights would take on large values, thereby driving Ω(θ) high.
6. To counter the previous point's shortcoming, we need to minimise Ltrain(θ) without allowing the weights to grow too large.
7. Thus, as shown in the figure, with L2 Regularisation we do not drive the training loss all the way to zero; instead we keep it slightly above zero so that Ω(θ) does not become too large.
8. This works in the Gradient Descent Algorithm as well
9. The algorithm
a. Initialise: w111, w112, … w313, b1, b2, b3 randomly
b. Iterate over data
i. Compute ŷ
ii. Compute L(w, b), the cross-entropy loss function
iii. w111 = w111 - η𝚫w111
iv. w112 = w112 - η𝚫w112
…
v. w313 = w313 - η𝚫w313
c. Till satisfied
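As a rough illustration of this loop (not the course's actual implementation), here is a Python sketch; the gradient function is a placeholder standing in for backpropagation, and all names and values are assumptions made for the example.

    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.standard_normal(6)        # stands in for w111, w112, ..., w313 and the biases
    eta = 0.1                         # learning rate

    def grad_loss(w):
        # Placeholder for the gradient of the loss obtained via backpropagation;
        # a real network would compute this from the data.
        return 2 * w - 1.0

    for step in range(100):           # "iterate over the data ... till satisfied"
        delta_w = grad_loss(w)
        w = w - eta * delta_w         # w_ijk <- w_ijk - eta * delta_w_ijk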
10. The derivative of the loss function w.r.t. any weight is ΔWijk = ∂L(θ)/∂Wijk
11. In the case of L2 Regularisation, that value becomes ΔWijk = ∂Ltrain(θ)/∂Wijk + ∂Ω(θ)/∂Wijk
12. When differentiating the regularisation term, every weight other than the one we are differentiating with respect to vanishes, and only the concerned weight contributes, i.e. ∂Ω(θ)/∂Wijk = 2Wijk
13. So the new gradient term will be ΔWijk = ∂Ltrain(θ)/∂Wijk + 2Wijk
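In code, the only change from the plain update is the extra 2·Wijk term. Below is a minimal NumPy sketch using made-up values for the weights and for the training-loss gradient (which would really come from backpropagation).

    import numpy as np

    w = np.array([0.5, -1.2, 3.0])            # current weights (toy values)
    grad_train = np.array([0.4, -0.1, 0.9])   # assumed dLtrain/dw from backpropagation
    eta = 0.1                                 # learning rate

    # L2-regularised gradient: dLtrain/dw + 2w, as in item 13.
    delta_w = grad_train + 2 * w

    # Gradient descent step with the regularised gradient.
    w = w - eta * delta_w

The weight with the largest magnitude (3.0) receives the largest extra push towards zero, which is how the penalty keeps weights from growing too large. In practice the penalty is usually scaled by a coefficient λ, making the extra term 2λWijk; the notes above implicitly use λ = 1.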
14. In PyTorch, this is handled automatically, for example through the weight_decay argument of the optimiser.
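As a rough sketch of what this looks like in practice (the model, data, and hyperparameter values below are placeholder assumptions, not from the lecture):

    import torch
    import torch.nn as nn

    model = nn.Linear(4, 2)                   # placeholder model

    # weight_decay makes the optimiser add a term proportional to each weight
    # to that weight's gradient, which is the effect of an L2 penalty on the
    # weights (up to a constant factor in how the penalty is written).
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)

    x = torch.randn(8, 4)                     # dummy inputs
    y = torch.randint(0, 2, (8,))             # dummy class labels
    criterion = nn.CrossEntropyLoss()

    loss = criterion(model(x), y)             # only the training loss is computed here
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                          # update uses gradient + weight_decay * w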