L09-10 DL and CNN
[Figure: pipeline diagram with stages Obtain Data, Clean Data, Preprocess Data, Hand Craft Features (SIFT/SURF)]
Deep Neural Networks: Why now?
• Data, Data, Data
• ImageNet (14,197,122 images, http://www.image-net.org/)
• AlexNet [7] achieved a top-5 error of 15.3% in the ImageNet 2012 challenge (ILSVRC 2012)
• More than 10.8 percentage points lower than that of the runner-up
• GPU Accelerated Computation
• Smart People
DNN: Why is it exciting?
• Deep-learning networks perform automatic feature extraction
without human intervention, unlike most traditional machine-
learning algorithms.
• Given that feature extraction is a task that can take teams of
data scientists years to accomplish, deep learning is a way to
circumvent the chokepoint of limited experts.
• It augments the powers of small data science teams, which by
their nature do not scale.
DNN: “New” Pipeline
[Figure: pipeline diagram. Labels: Domain Experts, Computer Vision, Deep Learning, “I’m a Feature Engineer”, Blackbox, Obtain Data, Clean Data, Preprocess Data, Hand Craft Features (SIFT/SURF), ML Champion]
Deep Neural Networks: Champions
Vanishing and Exploding Gradients
• Vanishing Gradient
• During backpropagation, the error travels from the output layer towards the input layer.
• The gradients often get smaller and smaller and approach zero.
• This eventually leaves the weights of the initial (lower) layers nearly unchanged.
• As a result, gradient descent may converge very slowly or never reach the optimum.
• Gradient Explosion
• Error gradients can accumulate during an update and become very large.
• This results in large updates to the network weights
• and, in turn, an unstable network.
https://www.analyticsvidhya.com/blog/2021/06/the-challenge-of-vanishing-exploding-gradients-in-deep-neural-networks/
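The effect can be illustrated with a toy calculation. The sketch below assumes a chain of single-unit sigmoid layers that all share one weight (a simplification chosen for illustration, not something from the slides); the helper name `backprop_scale` is hypothetical. Each layer multiplies the backpropagated error by `weight * sigmoid'(z)`, and the sigmoid derivative is at most 0.25.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    # Derivative of the sigmoid; its maximum value is 0.25 at z = 0.
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop_scale(weight, pre_activation, depth):
    """Factor by which the error gradient is scaled after passing back
    through `depth` identical single-unit sigmoid layers (toy model)."""
    factor = 1.0
    for _ in range(depth):
        factor *= weight * sigmoid_grad(pre_activation)
    return factor

# Small weights: the factor shrinks geometrically (vanishing gradient).
vanishing = backprop_scale(weight=1.0, pre_activation=0.0, depth=10)
# Large weights: the factor grows geometrically (exploding gradient).
exploding = backprop_scale(weight=8.0, pre_activation=0.0, depth=10)

print(f"vanishing: {vanishing:.2e}, exploding: {exploding:.0f}")
```

With ten layers the first factor is 0.25 to the power 10, and the second is 2 to the power 10, which shows how quickly depth amplifies either tendency.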
• Saturated activations leave no gradients to propagate; ReLU is a common alternative:
activation=tf.nn.relu
Regularization: Drop-out
• Avoids overspecialisation
• Not a real “layer”
• Randomly chooses a percentage of
neurons on the preceding layer
• Temporarily disconnects their
inputs and outputs
• For that training step they are removed from the
• forward pass,
• backpropagation,
• optimiser update
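A minimal sketch of the idea in plain Python follows. It assumes the "inverted dropout" variant, in which surviving activations are scaled by 1/(1 - rate) during training so that their expected sum is unchanged; the function name `dropout` and the example values are illustrative only.

```python
import random

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate` and rescale the
    survivors by 1 / (1 - rate) (inverted dropout, training time only)."""
    keep = 1.0 - rate
    mask = [1.0 if rng.random() < keep else 0.0 for _ in activations]
    dropped = [a * m / keep for a, m in zip(activations, mask)]
    return dropped, mask

rng = random.Random(0)            # fixed seed so the run is repeatable
activations = [1.0] * 1000        # hypothetical outputs of the preceding layer
dropped, mask = dropout(activations, rate=0.25, rng=rng)

kept_fraction = sum(mask) / len(mask)
print(f"kept {kept_fraction:.1%} of the neurons this step")
```

A fresh mask is drawn at every training step, and no mask is applied at inference time.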
Regularization: BatchNorm
from tensorflow.keras.layers import BatchNormalization
• Batch Normalisation
• Activation functions work best
within a small range around 0
• Batchnorm achieves this by scaling
and shifting all the outputs of
a layer together
• It learns the parameters for this
scaling and shifting
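The scaling and shifting can be sketched for a single neuron's outputs over one mini-batch. This is an illustrative training-time computation only: the names `gamma` (scale) and `beta` (shift) stand for the learned parameters, `eps` is a small constant assumed here for numerical safety, and the moving averages that a real layer keeps for inference are left out.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise a list of values to zero mean and unit variance over the
    batch, then apply the learned scale (gamma) and shift (beta)."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical outputs of one neuron for a mini-batch of four examples.
raw_outputs = [10.0, 12.0, 14.0, 16.0]
normalised = batch_norm(raw_outputs)
print(normalised)  # values now centred on 0, inside the activation's useful range
```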
CNN
CNN: Layer Architecture
a) 2D Filter (weights)
b) Random Image
c) Image convolved with Filter
d) Threshold to maximum filter value
e) Highlighted maximum values and surrounds
https://colab.research.google.com/drive/1ZAmIKkU-enU8YDY4d3PyS1Xn0Mp-gqax?usp=sharing
• https://setosa.io/ev/image-kernels/
Basics of a CNN operation
• The type of neighbourhood processing in a CNN is spatial convolution
• Computes a sum of products between pixels and a set of kernel weights
• At every spatial location in the input image
• The result at each (x,y) is a scalar value
• This scalar value is the output of a neuron
• Adding a bias and passing the result through an activation function,
• we have our good old NN!
https://cs231n.github.io/convolutional-networks/
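The computation at one spatial location can be written out directly. The sketch below assumes a 3x3 kernel and ReLU as the activation function; the patch, kernel, and bias values are made up for illustration.

```python
def neuron_output(patch, kernel, bias):
    """Sum of products between an image patch and the kernel weights,
    plus a bias, passed through a ReLU activation."""
    total = sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )
    return max(0.0, total + bias)

# Pixels under the kernel at one (x, y) location (illustrative values).
patch = [[1, 2, 0],
         [0, 1, 3],
         [2, 1, 0]]
# Kernel weights (illustrative values).
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

value = neuron_output(patch, kernel, bias=0.5)
print(value)  # a single scalar: the output of one neuron
```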
CNN: Convolution
• Neighbourhoods are called receptive fields (RF)
• The receptive fields move over the image, executing convolution
• The set of weights, arranged in the shape of a receptive field, is a kernel
• The number of spatial increments by which the RF moves is the stride
• To each convolution value we add a bias
• Then pass the result through an activation function to generate a single value
• This value is fed to the corresponding location in the input of the next layer
• This is repeated for all locations in the input image, resulting in a 2D set of values stored in the next layer
as a 2D array called a feature map
• The role of convolution here is to extract features, such as edges, points, and blobs
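The steps above can be sketched as a loop that slides the kernel over the image. This is an illustrative plain-Python version, not how a framework implements it: it assumes no padding and ReLU activation, and, like most deep-learning libraries, it does not flip the kernel. The function name `conv2d` and the example image are hypothetical.

```python
def conv2d(image, kernel, bias=0.0, stride=1):
    """Slide `kernel` over `image` in steps of `stride`; at each location
    compute sum of products + bias, apply ReLU, and store the result in
    the corresponding cell of the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = bias
            for a in range(kh):
                for b in range(kw):
                    total += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(max(0.0, total))  # ReLU activation
        feature_map.append(row)
    return feature_map

# A 4x4 image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1]] * 4
# A 2x2 kernel that responds to a dark-to-bright step from left to right.
edge_kernel = [[-1, 1],
               [-1, 1]]

feature_map = conv2d(image, edge_kernel, stride=1)
for row in feature_map:
    print(row)  # the middle column lights up where the edge is
```

With `stride=2` the same call produces a 2x2 feature map, because the receptive field visits fewer locations.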
• Convolutional layer:
• three feature maps, obtained from three distinct kernels!
• After convolution and activation:
• Subsampling (or pooling):
• Produces pooled feature maps: Pooling Layer
• Reduction in spatial resolution:
• contributes to translational invariance
• Reduces the volume of data
• Done by subdividing the feature maps into a set of small (typically 2x2) regions:
• Pooling neighbourhoods
• Replacing all the values of that neighbourhood by a single value
• Common pooling methods:
• Average pooling: substitute by the average
• Max-pooling: substitute by the max value
• L2 pooling: substitute by the square root of the sum of the squared values
https://cs231n.github.io/convolutional-networks/
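The three pooling methods differ only in how each neighbourhood is reduced to one value. The sketch below assumes non-overlapping square neighbourhoods (2x2 by default); the function name `pool2d` and the example feature map are illustrative.

```python
import math

def pool2d(feature_map, size=2, mode="max"):
    """Split the feature map into non-overlapping size x size
    neighbourhoods and replace each one by a single value."""
    reducers = {
        "max": max,
        "average": lambda v: sum(v) / len(v),
        "l2": lambda v: math.sqrt(sum(x * x for x in v)),
    }
    reduce_fn = reducers[mode]
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled

feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 3, 4],
               [1, 2, 0, 0]]

max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
l2_pooled = pool2d(feature_map, mode="l2")
print(max_pooled)  # a 4x4 map becomes 2x2: a quarter of the data
```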
Teaching a CNN to recognise simple images
https://jeffmacaluso.github.io/post/DeepLearningRulesOfThumb/
https://towardsdatascience.com/17-rules-of-thumb-for-building-a-neural-network-93356f9930af
Deep Learning: When to use?
• Excels in tasks where the basic unit (pixel, word) has very little
meaning in itself, but the combination of units has a useful meaning
Data Augmentation
• Input images can be cropped, rotated, or rescaled to create new
examples with the same labels as the original training set
https://colab.research.google.com/drive/1dLVBk9E94tOLc5BdN3ClZ2tEPUHe2TMt?usp=sharing
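As a rough illustration, the three transformations can be written for a tiny image stored as a nested list. This is a sketch under simplifying assumptions: rotation is restricted to 90 degrees, rescaling is nearest-neighbour upscaling by an integer factor, and all function names are hypothetical. In practice a library routine would do this, and cropped images would be resized back to the network's input size.

```python
def crop(img, top, left, height, width):
    """Cut a height x width window out of the image."""
    return [row[left:left + width] for row in img[top:top + height]]

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def rescale(img, factor):
    """Nearest-neighbour upscaling by an integer factor."""
    return [[px for px in row for _ in range(factor)]
            for row in img for _ in range(factor)]

def augment(img, label):
    """Create new training examples that keep the original label."""
    return [
        (crop(img, 0, 0, 2, 2), label),
        (rotate90(img), label),
        (rescale(img, 2), label),
    ]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
new_examples = augment(image, label="cat")
for new_image, new_label in new_examples:
    print(new_label, new_image)
```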
https://keras.io/guides/transfer_learning/
https://cs231n.github.io/transfer-learning/
Transfer Learning
• Big networks need lots of data and lots
of compute power
• Free Google Colab is not going to work!
• Transfer learning uses previously
trained models as
• a starting point for further refinement
and/or
• a front-end feature extractor for a classifier
(with a little refinement)
Transfer Learning
• The most common transfer learning workflow:
1. Take layers from a previously trained model.
2. Freeze them, so as to avoid destroying any of the information they
contain during future training rounds.
3. Add some new, trainable layers on top of the frozen layers. They will
learn to turn the old features into predictions on a new dataset.
4. Train the new layers on your dataset.
Transfer Learning – VGG16
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
# set the shape to the CIFAR-10 image size and the number of classes
input_shape = (32, 32, 3)
classes = 10
# load the VGG16 model with the ImageNet weights, but without the fully connected classifier layers at the top
base_model = VGG16(
    weights="imagenet",       # Load weights pre-trained on ImageNet.
    input_shape=input_shape,
    include_top=False,        # Do not include the ImageNet classifier at the top.
)
# Freeze the base_model
base_model.trainable = False
include_top=False removes the fully connected classifier layers at the top of the network
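Steps 3 and 4 of the workflow need a new, trainable classifier on top of the frozen base. One possible way to build it is sketched below. The helper name `add_classifier_head`, the 64-unit hidden layer, the Adam optimiser, and the sparse categorical cross-entropy loss (which assumes integer class labels) are all assumptions for illustration, not choices taken from the slides.

```python
from tensorflow import keras
from tensorflow.keras import layers

def add_classifier_head(base_model, input_shape, classes):
    """Stack a small trainable classifier on top of a frozen base model."""
    base_model.trainable = False              # keep the pre-trained weights fixed
    inputs = keras.Input(shape=input_shape)
    x = base_model(inputs, training=False)    # run the base in inference mode
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)   # hypothetical head size
    outputs = layers.Dense(classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer labels
        metrics=["accuracy"],
    )
    return model
```

With the `base_model`, `input_shape`, and `classes` defined earlier, `model = add_classifier_head(base_model, input_shape, classes)` would give a model in which only the two new Dense layers are trained.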
Transfer Learning
[Figure: vgg16]
# train the new layers; the frozen base is not updated
epochs = 5
model.fit(X_train, y_train, epochs=epochs)
Transfer Learning – Fine-tuning
[Figure: vgg16]
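Fine-tuning usually means unfreezing some or all of the pre-trained layers once the new layers have been trained, and continuing training with a small learning rate so the pre-trained weights change only slightly. The sketch below is one way to set that up. The helper name `prepare_fine_tuning`, the learning rate of 1e-5, and the loss are assumptions for illustration.

```python
from tensorflow import keras

def prepare_fine_tuning(model, base_model, learning_rate=1e-5):
    """Unfreeze the base model and recompile with a small learning rate.
    Recompiling is needed for the change in `trainable` to take effect."""
    base_model.trainable = True
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",  # assumes integer labels
        metrics=["accuracy"],
    )
    return model
```

After this call, a further `model.fit(...)` on the same data would update the base weights as well as the new layers.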