CNNs PyTorch
IALAB UC
Training Tricks
ReLUs
Dropout
Batch Normalization
PyTorch - in depth!
1. Dropout: A Simple Way to Prevent Neural Networks from Overfitting, http://jmlr.org/papers/v15/srivastava14a.html
Training Tricks: Dropout
2. DropBlock: A regularization method for convolutional networks, https://arxiv.org/abs/1810.12890
3. Regularization of Neural Networks using DropConnect, http://yann.lecun.com/exdb/publis/pdf/wan-icml-13.pdf
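A minimal PyTorch sketch of the dropout idea (layer sizes, batch size, and p = 0.5 are assumed for illustration, not taken from the slides):

import torch
import torch.nn as nn

# Small MLP with a dropout layer between the hidden and output layers.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zeroes 50% of the activations during training
    nn.Linear(256, 10),
)

x = torch.randn(32, 784)

model.train()            # dropout active: surviving activations are rescaled by 1/(1-p)
out_train = model(x)

model.eval()             # dropout disabled: the layer acts as the identity at inference
out_eval = model(x)

Because nn.Dropout uses inverted dropout (rescaling by 1/(1-p) at training time), no extra rescaling is needed when the model is switched to eval mode.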
Training Tricks: Batch Normalization
4. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, https://arxiv.org/abs/1502.03167
Batch Normalization (BN), Ioffe and Szegedy, 2015
\hat{x}^{(k)} = \frac{\bar{x}^{(k)} - \mathrm{E}[\bar{x}^{(k)}]}{\sqrt{\mathrm{Var}[\bar{x}^{(k)}]}} \qquad (1)
The expectation and variance are computed over the mini-batch for each dimension k, so each dimension is normalized independently.
Therefore, BN normalizes each scalar feature independently, aiming to make it unit Gaussian, i.e., each dimension has zero mean and unit variance.
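A minimal sketch of Eq. (1) in PyTorch (batch size, feature dimension, and eps are assumed for illustration): the per-dimension mean and variance over the mini-batch reproduce what nn.BatchNorm1d computes in training mode, since its learnable scale and shift are initialized to 1 and 0.

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 8)                    # mini-batch of 32 examples, 8 dimensions k

# Manual normalization from Eq. (1): mean/variance per dimension over the batch.
mean = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)        # BN uses the biased (population) variance
x_hat = (x - mean) / torch.sqrt(var + 1e-5)

# The same computation through PyTorch's layer (gamma = 1, beta = 0 at init).
bn = nn.BatchNorm1d(8, eps=1e-5)
bn.train()
out = bn(x)

print(torch.allclose(out, x_hat, atol=1e-5))   # True: identical up to numerical tolerance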