
Deep Learning

Experiments
Lecture 15

22 January 2025
Normalizing Data Sets

Why Normalization / Speeding Up Training
• Use the same normalizer on the test set, applied in exactly the same way as on the training set.
• If features are on very different scales (e.g., one ranging over 1 to 1000 and another over 0 to 1), the corresponding weights will end up taking very different values.
• More gradient-descent steps may then be needed to reach the optimum, so learning can be slow.
• After normalization the cost "bowl" is more spherical and symmetrical, making it easier and faster to optimize.

Normalizing Training Sets
Subtract the mean:
$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad x := x - \mu$

Normalize the variance (element-wise square):
$\sigma^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x^{(i)}\right)^2, \qquad x := \frac{x}{\sigma}$
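A minimal NumPy sketch of the normalization above, reusing the training-set statistics on the test set exactly as the slides recommend (function and variable names are my own; the example data is illustrative):

```python
import numpy as np

def fit_normalizer(X_train):
    """Compute per-feature mean and standard deviation on the training set."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0) + 1e-8   # small epsilon to avoid division by zero
    return mu, sigma

def normalize(X, mu, sigma):
    """Apply the *training-set* statistics to any split (train, dev, or test)."""
    return (X - mu) / sigma

# Usage: fit on the training set, reuse the same mu/sigma for the test set
X_train = np.random.rand(1000, 3) * [1.0, 1000.0, 0.5]   # features on very different scales
X_test = np.random.rand(200, 3) * [1.0, 1000.0, 0.5]
mu, sigma = fit_normalizer(X_train)
X_train_n = normalize(X_train, mu, sigma)
X_test_n = normalize(X_test, mu, sigma)
```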
Vanishing/Exploding Gradients
Assume a linear activation $g(z) = z$ and $b^{[l]} = 0$. Then
$\hat{y} = W^{[L]} W^{[L-1]} W^{[L-2]} \cdots W^{[3]} W^{[2]} W^{[1]} x$

If every weight matrix is slightly larger than the identity, e.g. $W^{[l]} = \begin{bmatrix} 1.5 & 0 \\ 0 & 1.5 \end{bmatrix}$, the activations grow like $1.5^{\,L-1}$; if it is slightly smaller, e.g. $W^{[l]} = \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix}$, they shrink like $0.5^{\,L-1}$. The matrix is effectively multiplied $L-1$ times (since $W^{[L]}$ may have a different dimension), leading to exploding or vanishing activations and gradients.
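A small numerical sketch of this effect (my own illustration, not from the slides): repeatedly multiplying by a 2×2 matrix slightly above or below the identity makes the activations blow up or collapse.

```python
import numpy as np

x = np.ones((2, 1))
L = 50  # number of layers (illustrative choice)

for scale in (1.5, 0.5):
    W = scale * np.eye(2)          # each layer's weight matrix
    a = x
    for _ in range(L - 1):         # the matrix is applied L-1 times
        a = W @ a
    print(f"scale={scale}: ||a|| = {np.linalg.norm(a):.3e}")
# scale=1.5 -> very large (exploding), scale=0.5 -> nearly zero (vanishing)
```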
Exploding/Vanishing Gradients
• Gradients (slopes) becoming too small or too large.
• So it is very important to pay attention to how we initialize the weights.
• If a unit has many inputs, or the feature values are large, the weights need to be correspondingly small.
• It has been proposed to set the variance of the weights to 2/n, where n is the number of inputs to the layer (for ReLU activations this is He initialization); see the sketch below.

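A minimal NumPy sketch of initializing weights with variance 2/n (the layer sizes and function name are my own assumptions):

```python
import numpy as np

def he_init(n_in, n_out, rng=np.random.default_rng(0)):
    """Draw weights with variance 2/n_in, as suggested for ReLU layers."""
    return rng.standard_normal((n_out, n_in)) * np.sqrt(2.0 / n_in)

W1 = he_init(n_in=784, n_out=128)   # example layer sizes (assumed, not from the slides)
print(W1.var())                      # close to 2/784
```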
Batch Norm
It is an extension of input normalization that applies to every layer of the neural network.
Given some intermediate values $z^{(i)}$ in the network (for one mini-batch):
• $\mu = \frac{1}{m}\sum_i z^{(i)}$
• $\sigma^2 = \frac{1}{m}\sum_i \left(z^{(i)} - \mu\right)^2$
• $z^{(i)}_{\text{norm}} = \dfrac{z^{(i)} - \mu}{\sqrt{\sigma^2 + \epsilon}}$
• $\tilde{z}^{(i)} = \gamma\, z^{(i)}_{\text{norm}} + \beta$, where $\gamma$ and $\beta$ are learnable parameters
• If $\gamma = \sqrt{\sigma^2 + \epsilon}$ and $\beta = \mu$, then $\tilde{z}^{(i)} = z^{(i)}$, so batch norm can recover the identity mapping.

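A minimal NumPy sketch of the batch-norm transformation above for one mini-batch (illustrative only; the function name, shapes, and epsilon value are my own choices):

```python
import numpy as np

def batch_norm_forward(Z, gamma, beta, eps=1e-8):
    """Normalize a mini-batch of pre-activations Z (shape: units x batch_size),
    then scale and shift with the learnable gamma and beta."""
    mu = Z.mean(axis=1, keepdims=True)
    var = Z.var(axis=1, keepdims=True)
    Z_norm = (Z - mu) / np.sqrt(var + eps)
    Z_tilde = gamma * Z_norm + beta
    return Z_tilde, (Z_norm, mu, var)

# Usage (assumed shapes): 4 hidden units, mini-batch of 8 examples
Z = np.random.randn(4, 8)
gamma = np.ones((4, 1))
beta = np.zeros((4, 1))
Z_tilde, cache = batch_norm_forward(Z, gamma, beta)
```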
Applying Batch Norm
• $X \xrightarrow{\;w^{[1]},\, b^{[1]}\;} z^{[1]} \xrightarrow{\;\beta^{[1]},\, \gamma^{[1]}\;} \tilde{z}^{[1]} \rightarrow a^{[1]} = g^{[1]}(\tilde{z}^{[1]}) \rightarrow \ldots$
• In TensorFlow this is available as tf.nn.batch_normalization (a usage sketch follows below).
• Each mini-batch is scaled by the mean/variance computed on just that mini-batch.
• This adds some noise to the values of z within that mini-batch. Similar to dropout, it has a slight regularization effect because of the noise it adds to the hidden-layer activations.
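A short sketch of how the per-layer computation above might look with tf.nn.batch_normalization (a hedged example; the layer sizes, variable names, and epsilon are my own assumptions):

```python
import tensorflow as tf

# Assumed shapes: mini-batch of 8 examples, 3 input features, 4 hidden units
X = tf.random.normal((8, 3))
W1 = tf.Variable(tf.random.normal((3, 4)) * (2.0 / 3.0) ** 0.5)  # variance 2/n init
beta1 = tf.Variable(tf.zeros((4,)))    # learnable shift
gamma1 = tf.Variable(tf.ones((4,)))    # learnable scale

Z1 = tf.matmul(X, W1)                  # bias b[1] can be dropped: beta1 plays its role
mu, var = tf.nn.moments(Z1, axes=[0])  # mean/variance over the mini-batch
Z1_tilde = tf.nn.batch_normalization(Z1, mu, var, beta1, gamma1, variance_epsilon=1e-8)
A1 = tf.nn.relu(Z1_tilde)              # a[1] = g[1](z~[1])
```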
Why Batch Norm
• Applying it to earlier layers helps decouple them from the later layers: each layer sees a more stable input distribution.
• That means it provides robustness to covariate shift, i.e. to changes in the input distribution.
• So if the input examples change frequently, batch norm provides a cushion so the effect tapers off by the time it reaches the later layers.
• For test data, prefer the exponentially weighted averages of $\mu$ and $\sigma^2$ collected for each layer across mini-batches during training (see the sketch below); these work better than the $\mu$ and $\sigma^2$ values of the training set itself.
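A minimal sketch of keeping exponentially weighted averages of $\mu$ and $\sigma^2$ during training for use at test time (the class name and the momentum value 0.9 are my own illustrative choices):

```python
import numpy as np

class RunningBatchNormStats:
    """Track exponentially weighted averages of a layer's mean and variance."""
    def __init__(self, n_units, momentum=0.9):
        self.momentum = momentum
        self.mu = np.zeros((n_units, 1))
        self.var = np.ones((n_units, 1))

    def update(self, Z):
        """Call once per mini-batch during training (Z: units x batch_size)."""
        mu_batch = Z.mean(axis=1, keepdims=True)
        var_batch = Z.var(axis=1, keepdims=True)
        self.mu = self.momentum * self.mu + (1 - self.momentum) * mu_batch
        self.var = self.momentum * self.var + (1 - self.momentum) * var_batch

    def normalize(self, Z, gamma, beta, eps=1e-8):
        """At test time, use the running averages instead of batch statistics."""
        Z_norm = (Z - self.mu) / np.sqrt(self.var + eps)
        return gamma * Z_norm + beta
```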
How to Try Hyperparameters
First focus on the most important ones, then the lesser ones, roughly in this sequence:
• Learning rate
• Beta (β)
• Number of hidden units
• Mini-batch size
• Number of layers
• Learning-rate decay
• Others

Random Is Better than a Grid
[Figure: candidate hyperparameter points laid out on a regular grid over the two axes α and β.]
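A brief sketch contrasting grid search with random search over two hyperparameters (the ranges and trial count are my own illustrative assumptions): with a grid, each hyperparameter is tried at only a handful of distinct values, whereas random sampling gives every trial a fresh value of each one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 25

# Grid search: only 5 distinct values per hyperparameter across 25 trials
grid_alpha = np.linspace(0.001, 0.1, 5)
grid_beta = np.linspace(0.8, 0.99, 5)
grid_points = [(a, b) for a in grid_alpha for b in grid_beta]

# Random search: every trial explores a new value of each hyperparameter
random_points = [(rng.uniform(0.001, 0.1), rng.uniform(0.8, 0.99))
                 for _ in range(n_trials)]
```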
Coarse to Fine
After a first random pass, zoom in on the region around the best-performing points and sample more densely there.
Picking Hyperparameters at Random and on the Right Scale
• Is it OK to sample on the actual (linear) scale for all hyperparameters, or do some of them require a log scale?
• For the learning rate over the range 0.0001 to 1, uniform sampling on a linear scale would spend about 90% of the samples between 0.1 and 1; sampling the exponent uniformly (log scale) covers each decade evenly. See the sketch below.

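A minimal sketch of sampling a learning rate on a log scale between 0.0001 and 1 (the NumPy approach here is my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample the exponent uniformly in [-4, 0], then take 10**exponent,
# so learning rates between 0.0001 and 1 are covered decade by decade.
r = rng.uniform(-4, 0, size=5)
alphas = 10.0 ** r
print(alphas)   # values spread across 1e-4 .. 1e0
```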
Babysitting One Model vs. Parallel Model Training
The "Panda" vs. "Caviar" approach
• Which one to use depends on the application, the compute resources, and the time you have.
• In babysitting (Panda), we painstakingly watch a single model and introduce mid-course corrections to its hyperparameters.
• In parallel training (Caviar), we simultaneously train many models with different hyperparameter settings and pick the best.
Thank You
For more information, please visit the
following links:

[email protected]
[email protected]
https://www.linkedin.com/in/gauravsingal789/
http://www.gauravsingal.in

