L09-10 DL and CNN

The document discusses convolutional neural networks (CNNs) and their use in computer vision tasks. It describes how CNNs learn features directly from input images through multiple processing layers including convolution, activation, pooling and fully connected layers. CNNs have achieved human-level performance on image classification tasks due to availability of large datasets and use of graphics processing units for efficient training. The document provides examples and illustrations of common CNN architectures and their individual components.

Uploaded by

Paulo Santos

COMP2712 – NNML
• Lecturer: Paulo Santos
• [email protected]
• Office: 4.24
Convolutional NN
Deep Convolutional Neural Networks (CNN)
• Up to this point: patterns were organised in terms of feature vectors
• The form of these features is specified by a human designer
• Extracted from the images prior to being input to the NN
• Convolutional Neural Networks:
• Accept images as inputs
• Learn the features as well as the classification
DNN: Classical Pipeline
[Figure: classical pipeline diagram with the labels Obtain Data, Clean Data, Preprocess Data, Hand Craft Features (SIFT/SURF), Blackbox (SVM), Domain Experts, Computer Vision, ML Assistant]
Deep Neural Networks: Why now?
• Data, Data, Data
• ImageNet (14,197,122 images, http://www.image-net.org/)
• AlexNet[7] achieved a top-5 error of 15.3% in the ImageNet 2012 challenge
• This was more than 10.8 percentage points lower than that of the runner-up
• GPU Accelerated Computation
• Smart People
DNN: Why is it exciting?
• Deep-learning networks perform automatic feature extraction
without human intervention, unlike most traditional machine-
learning algorithms.
• Given that feature extraction is a task that can take teams of
data scientists years to accomplish, deep learning is a way to
circumvent the chokepoint of limited experts.
• It augments the powers of small data science teams, which by
their nature do not scale.
DNN: “New” Pipeline
[Figure: pipeline diagram with the labels Obtain Data, Clean Data, Preprocess Data, Hand Craft Features (SIFT/SURF), Blackbox, DeepLearning, Domain Experts, Computer Vision, “I’m a Feature Engineer”, ML Champion]
Deep Neural Networks: Champions
Vanishing and Exploding Gradients
• Vanishing Gradient
• During backpropagation the error travels from the output layer towards the input layer.
• The gradients often get smaller and smaller and approach zero.
• This eventually leaves the weights of the initial (lower) layers nearly
unchanged.
• As a result, gradient descent may never converge to a good solution
• Gradient Explosion
• Error gradients can accumulate during an update and result in very
large gradients
• These result in large updates to the network weights
• and, in turn, an unstable network
https://www.analyticsvidhya.com/blog/2021/06/the-challenge-of-vanishing-exploding-gradients-in-deep-neural-networks/
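A minimal sketch of why gradients vanish or explode, assuming a chain of sigmoid neurons with one weight per layer (the function names and the weight values are illustrative, not from the lecture). Along one path, backpropagation multiplies one factor `weight * sigmoid'(z)` per layer, so the product shrinks or grows geometrically with depth.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # at most 0.25, reached at z = 0

def backprop_factor(num_layers, z=0.0, weight=1.0):
    """Product of the per-layer factors weight * sigmoid'(z) along one path."""
    factor = 1.0
    for _ in range(num_layers):
        factor *= weight * sigmoid_derivative(z)
    return factor

for depth in (1, 5, 10, 20):
    # weight=1.0: the factor is 0.25 ** depth and vanishes quickly.
    # weight=5.0: each layer multiplies by 1.25, so the factor grows instead.
    print(depth, backprop_factor(depth), backprop_factor(depth, weight=5.0))
```

Even in the best case (z = 0) the sigmoid contributes a factor of 0.25 per layer, which is why the lower layers of a deep sigmoid network barely move.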

Vanishing and Exploding Gradients
• Vanishing Gradient
• The sigmoid activation saturates at 0 or 1, with a derivative very close to zero
• All the fun happens in the steep region around the centre of the curve
• In the saturated regions backpropagation has no gradients to propagate!

Vanishing and Exploding Gradients
• Better, non-saturating activation functions
• ReLU (Rectified Linear Unit) and leaky ReLU
• In Keras: activation=tf.nn.relu
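The two activations can be sketched in a few lines of plain Python. The slope `alpha=0.01` for the leaky variant is a common default chosen here as an assumption, not a value given in the lecture.

```python
def relu(z):
    """Rectified Linear Unit: passes positive inputs, clips negatives to zero."""
    return z if z > 0 else 0.0

def relu_grad(z):
    """Gradient is 1 for any positive input, however large: no saturation."""
    return 1.0 if z > 0 else 0.0

def leaky_relu(z, alpha=0.01):
    """Like ReLU, but keeps a small slope alpha for negative inputs."""
    return z if z > 0 else alpha * z

def leaky_relu_grad(z, alpha=0.01):
    """Never exactly zero, so units with negative inputs still receive updates."""
    return 1.0 if z > 0 else alpha

for z in (-5.0, -0.5, 0.5, 50.0):
    print(z, relu(z), relu_grad(z), leaky_relu(z), leaky_relu_grad(z))
```

Unlike the sigmoid, the gradient for positive inputs stays at 1 regardless of magnitude, which is the sense in which these functions are non-saturating.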

Regularization: Drop-out
• Avoids overspecialisation (overfitting)
• Not a real “layer”
• Randomly chooses a percentage of the
neurons on the preceding layer
• Temporarily disconnects their
inputs and outputs
• For that training step they are removed from
• the forward pass,
• backpropagation
• the optimiser update
• In Keras: from tensorflow.keras.layers import Dropout
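A minimal sketch of dropout on a list of activations, written in plain Python rather than with the Keras layer. It uses the "inverted" form, in which the surviving activations are scaled by 1/(1 - rate) so the expected sum stays the same; the function name and the `rng` parameter are assumptions of this sketch.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Zero each activation with probability `rate` during training only."""
    if not training or rate == 0.0:
        return list(activations)  # at inference time nothing is dropped
    keep = 1.0 - rate
    out = []
    for a in activations:
        if rng.random() < rate:
            out.append(0.0)        # neuron temporarily disconnected
        else:
            out.append(a / keep)   # survivors are scaled up to compensate
    return out

rng = random.Random(0)
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng))
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, training=False))
```

A different random subset is dropped on every call, so no single neuron can become indispensable.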
Regularization: BatchNorm
• Batch Normalisation
• Activation functions work best
within a small range around 0
• Batchnorm achieves this by scaling
and shifting all the outputs of
a layer together, using the statistics of the current mini-batch
• It learns the parameters for this
scaling and shifting
• In Keras: from tensorflow.keras.layers import BatchNormalization
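A sketch of the training-time computation for a single feature over one mini-batch, in plain Python. The names `gamma` (scale) and `beta` (shift) follow the usual convention for the learned parameters; here they are fixed arguments, and the running averages a real layer keeps for inference are left out.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one feature over a mini-batch, then scale and shift it."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    # eps avoids division by zero when all values in the batch are equal.
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

outputs = [10.0, 12.0, 14.0, 16.0]           # far from the range around 0
print(batch_norm(outputs))                   # roughly zero mean, unit variance
print(batch_norm(outputs, gamma=2.0, beta=3.0))
```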
CNN
CNN: Layer Architecture

Deep neural networks + convolutions

Basic CNN architecture
http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
http://mlss.tuebingen.mpg.de/2015/slides/fergus/Fergus_1.pdf

CNN: Layer Architecture

LeCun (1998) Gradient-Based Learning Applied to Document Recognition

• There are four main operations

• Convolution
• Non-Linearity (ReLU)
• Pooling or Sub Sampling
• Classification (Fully Connected or Dense Layer)
https://cs231n.github.io/convolutional-networks/

CNN: Layer Architecture

The architecture shown here is a tiny VGG Net: http://www.robots.ox.ac.uk/~vgg/research/very_deep/

Pre-trained weights: https://keras.io/api/applications/

What is a convolution
Find a vertical white stripe up the centre

a) 2D Filter (weights)
b) Random Image
c) Image convolved with Filter
d) Threshold to maximum filter value
e) Highlighted maximum values and surrounds

• https://colab.research.google.com/drive/1ZAmIKkU-enU8YDY4d3PyS1Xn0Mp-gqax?usp=sharing
• https://setosa.io/ev/image-kernels/
Basics of a CNN operation
• The type of neighbourhood processing in a CNN is spatial convolution
• It computes a sum of products between pixels and a set of kernel weights
• at every spatial location in the input image
• The result at each (x,y) is a scalar value
• This scalar value plays the role of a neuron's weighted sum of inputs
• Adding a bias and passing the result through an activation function gives the neuron's output
• → we have our good old NN!
https://cs231n.github.io/convolutional-networks/
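The sum of products can be sketched in plain Python as follows. This is an illustration under stated assumptions: the kernel is not flipped (as in most deep-learning libraries, the operation is strictly a cross-correlation), there is no padding, and the names `conv2d_valid`, `stride` (the step size of the window, 1 by default) and the stripe-detecting kernel are choices of this sketch.

```python
def relu(s):
    return max(0.0, s)

def conv2d_valid(image, kernel, bias=0.0, stride=1, activation=None):
    """Slide the kernel over the image; each position yields one scalar."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for y in range(0, ih - kh + 1, stride):
        row = []
        for x in range(0, iw - kw + 1, stride):
            s = bias
            for i in range(kh):
                for j in range(kw):
                    s += image[y + i][x + j] * kernel[i][j]  # sum of products
            row.append(activation(s) if activation else s)
        out.append(row)
    return out

# A 4x5 image with a vertical white stripe in the middle column.
image = [[0, 0, 1, 0, 0] for _ in range(4)]
# A kernel that responds to a bright column flanked by dark columns.
kernel = [[-1, 2, -1] for _ in range(3)]

print(conv2d_valid(image, kernel))                   # [[-3.0, 6.0, -3.0], [-3.0, 6.0, -3.0]]
print(conv2d_valid(image, kernel, activation=relu))  # [[0.0, 6.0, 0.0], [0.0, 6.0, 0.0]]
```

The response peaks where the kernel is centred on the stripe, and the ReLU removes the negative side responses.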

CNN: Convolution
• Neighbourhoods → Receptive Fields (RF)
• The receptive fields move over the image executing convolution
• The set of weights, arranged in the shape of the receptive field, is a kernel
• The number of spatial increments by which the RF moves at each step is the stride
• To each convolution value we add a bias
• Then pass the result through an activation function to generate a single value
• This value is fed to the corresponding location in the input of the next layer
• This is repeated for all locations in the input image, resulting in a 2D set of values stored in the next layer
as a 2D array called a feature map
• → the role of convolution here is to extract features, such as edges, points, blobs
• Convolutional layer:
• three feature maps, obtained from three distinct kernels!
• After convolution and activation:
• Subsampling (or pooling):
• Produces pooled feature maps: Pooling Layer
• Reduction in spatial resolution:
• contributes to translational invariance
• Reduces the volume of data
• Done by subdividing the feature maps into a set of small (typically 2x2) regions:
• Pooling neighbourhoods
• Replacing all the values of that neighbourhood by a single value
• Common pooling methods:
• Average pooling: substitute by the average
• Max-pooling: substitute by the max value
• L2 pooling: substitute by the square root of the sum of the squared values
https://cs231n.github.io/convolutional-networks/
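A sketch of the three pooling methods on non-overlapping neighbourhoods, in plain Python. The function name `pool2d` and the `mode` strings are assumptions of this sketch; L2 pooling is taken to mean the square root of the sum of the squared values in the neighbourhood.

```python
import math

def pool2d(feature_map, size=2, mode="max"):
    """Replace each size x size neighbourhood by a single value."""
    reducers = {
        "max": max,
        "average": lambda v: sum(v) / len(v),
        "l2": lambda v: math.sqrt(sum(x * x for x in v)),
    }
    reduce_fn = reducers[mode]
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for y in range(0, h - size + 1, size):
        row = []
        for x in range(0, w - size + 1, size):
            values = [feature_map[y + i][x + j]
                      for i in range(size) for j in range(size)]
            row.append(reduce_fn(values))
        pooled.append(row)
    return pooled

fm = [[1, 3, 2, 0],
      [5, 7, 1, 1],
      [0, 0, 3, 4],
      [0, 2, 0, 0]]

print(pool2d(fm, mode="max"))      # [[7, 2], [2, 4]]
print(pool2d(fm, mode="average"))  # [[4.0, 1.0], [0.5, 1.75]]
print(pool2d(fm, mode="l2"))
```

A 4x4 feature map becomes a 2x2 pooled feature map, so the volume of data drops by a factor of four.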

CNN: Pooling or Sub-sampling

• Convolution:
• Filtered images
• Pooling:
• Filtered images of lower resolution
• The pooled feature maps in the first layer become the inputs to the next layer
• But we now have multiple pooled feature maps
• As convolution is a linear operation (remember assignment 1??)
• The values can be combined into a single one by superposition (summing the results for the individual maps)

• The ultimate goal is classification:
• The final pooled feature maps are fed into a Fully Connected Neural Net
• As we’ve seen before → the input should be vectorized.
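A sketch of the vectorisation step followed by one fully connected layer, in plain Python. The softmax output, the function names and the weight values are assumptions for illustration; the lecture does not specify the classifier's details.

```python
import math

def flatten(feature_maps):
    """Concatenate a list of 2D pooled feature maps into one 1D vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]

def dense_softmax(vector, weights, biases):
    """One fully connected layer (one weight row per class) with softmax."""
    logits = [sum(w * v for w, v in zip(w_row, vector)) + b
              for w_row, b in zip(weights, biases)]
    m = max(logits)                          # subtract the max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

pooled_maps = [[[1, 2], [3, 4]],
               [[5, 6], [7, 8]]]             # two 2x2 pooled feature maps
vector = flatten(pooled_maps)
print(vector)                                # [1, 2, 3, 4, 5, 6, 7, 8]

weights = [[0.1] * 8, [-0.1] * 8]            # hypothetical weights, 2 classes
print(dense_softmax(vector, weights, [0.0, 0.0]))
```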
Example

• Think of each element of a 2D array in the top row as a neuron
• The outputs of these neurons are pixel values, creating
feature maps
• The neurons in the feature map of the 1st layer have
output values generated by convolving the input image
with a kernel whose size and shape are the same as the
receptive field
• and whose coefficients are learned during training
• To each convolution value we add a bias and pass the
result through an activation function to generate the
output value of the corresponding neuron in the feature
map
• The output values of neurons in the pooled feature maps
are generated by pooling the output values of neurons in
the feature maps
• The kernel weights (shown as intensity values) are
learned from sample images using backpropagation
• Therefore, the nature of the learned features is determined
by the learned kernel coefficients
Graphical illustration of the functions performed by the components of a CNN
[Figure: processing stages from left to right: feature maps, pooled feature maps, feature maps, pooled feature maps, vector, neural net; output labels 0, 5, 9]
Teaching a CNN to recognise simple images
[Figure: Training Image Set and Test Image Set]

CNN to recognise handwritten numerals
(MNIST dataset)
• 60,000 training images
• 10,000 test images
• Grayscale images of size 28x28 pixels
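To see how a 28x28 input shrinks through a network, the spatial size after each layer can be computed directly. The 5x5 kernels and 2x2 pooling below are hypothetical choices for illustration; the architecture figure from the lecture is not reproduced in this text.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Side length of the output of a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1

size = 28                                     # MNIST images are 28x28
size = output_size(size, kernel=5)            # conv 5x5  -> 24
size = output_size(size, kernel=2, stride=2)  # pool 2x2  -> 12
size = output_size(size, kernel=5)            # conv 5x5  -> 8
size = output_size(size, kernel=2, stride=2)  # pool 2x2  -> 4
print(size)  # 4
```

With these assumed layers, each final pooled feature map holds 4x4 = 16 values to be vectorised for the fully connected net.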
Architecture of the CNN trained to recognise
ten digits in the MNIST dataset
Kernels
Same architecture as before
Results of a forward pass
CNN: Visualisation
• http://www.cs.cmu.edu/~aharley/vis/
• http://www.cs.cmu.edu/~aharley/vis/conv/
Limitations
https://arxiv.org/pdf/1312.6199.pdf

CNN: Intriguing properties

[Figure: adversarial examples. Columns: correctly classified image, difference (the added perturbation), perturbed image classified as “ostrich”]

Deep Learning: Pros and Cons
Pros
• Best performing method in many CV tasks
• No need for hand-crafting
• Robust to natural variation
• Many different applications
• Large-scale problems
• Improves with more data
• Easy parallelization on GPUs
Cons
• Need huge amount of data
• Hard to design and tune
• Difficult to analyse/understand
• SVMs easy to deploy and get good results
• Tend to learn everything
Deep Learning: Best Practices
• Check/clean your data
• Shuffle the training samples
• Split your data into training and testing samples
• Use validation data as well
• Never train on test data
• Start with an existing network and adapt it
• Start smallish, keep adding layers and nodes
• Check that you can achieve zero loss on a tiny subset

https://jeffmacaluso.github.io/post/DeepLearningRulesOfThumb/
https://towardsdatascience.com/17-rules-of-thumb-for-building-a-neural-network-93356f9930af
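The shuffle and split steps from the list above can be sketched in plain Python. The 80/10/10 proportions, the fixed seed and the function name are assumptions of this sketch rather than recommendations from the lecture.

```python
import random

def shuffle_and_split(samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle once, then cut into disjoint train, validation and test sets."""
    items = list(samples)
    random.Random(seed).shuffle(items)        # seeded for reproducibility
    n_test = round(len(items) * test_fraction)
    n_val = round(len(items) * val_fraction)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]            # never overlaps with test
    return train, val, test

train, val, test = shuffle_and_split(range(100))
print(len(train), len(val), len(test))        # 80 10 10
```

Because the three sets are slices of one shuffled list, no sample can end up in both the training and the test data.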
Deep Learning: When to use?

• You have a large amount of good-quality data
• You are modelling image/audio/language/time-series data
• Excels in tasks where the basic unit (pixel, word) has very little
meaning in itself, but their combination has a useful meaning
• You need a model that is less reliant on handmade features and
instead can learn features from the data
The need for more data…
https://www.tensorflow.org/tutorials/images/data_augmentation

Data Augmentation
• Input images can be cropped, rotated, or rescaled to create new
examples with the same labels as the original training set
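A sketch of the idea on a tiny image stored as nested lists, in plain Python. The helper names, the fixed 90-degree rotation and the nearest-neighbour rescaling are simplifying assumptions; in practice a library would apply random transformations.

```python
def crop(image, top, left, height, width):
    return [row[left:left + width] for row in image[top:top + height]]

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def rescale_nearest(image, new_h, new_w):
    """Nearest-neighbour rescaling to new_h x new_w."""
    h, w = len(image), len(image[0])
    return [[image[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

def augment(image, label):
    """Return new (image, label) examples that all keep the original label."""
    h, w = len(image), len(image[0])
    cropped = crop(image, 0, 0, h - 1, w - 1)
    return [
        (image, label),
        (rotate90(image), label),
        (rescale_nearest(cropped, h, w), label),  # crop, then scale back up
    ]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
for new_image, new_label in augment(image, "cat"):
    print(new_label, new_image)
```

Whether a transformation really preserves the label depends on the task, so the set of transformations should be chosen with the data in mind.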

https://colab.research.google.com/drive/1dLVBk9E94tOLc5BdN3ClZ2tEPUHe2TMt?usp=sharing
https://keras.io/guides/transfer_learning/
https://cs231n.github.io/transfer-learning/

Transfer Learning
• Big networks need lots of data and lots
of compute power
• Free Google Colab is not going to work!
• Transfer learning is using previously
trained models as
• a starting point for further refinement
and/or
• a front-end feature extractor for a classifier
(with a little refinement)
Transfer Learning
• The most common transfer learning workflow:
1. Take layers from a previously trained model.
2. Freeze them, so as to avoid destroying any of the information they
contain during future training rounds.
3. Add some new, trainable layers on top of the frozen layers. They will
learn to turn the old features into predictions on a new dataset.
4. Train the new layers on your dataset.
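The four steps can be illustrated with a toy stand-in in plain Python: a fixed feature extractor plays the role of the previously trained layers, and a small logistic-regression "head" is the new trainable layer. All weights, data points and names here are hypothetical; this is not the Keras workflow itself, only the freeze-then-train idea.

```python
import math

# Steps 1 and 2: layers from a "previously trained model", kept frozen.
base_weights = [[1.0, -1.0], [0.5, 0.5]]      # hypothetical values

def base_features(x):
    """Frozen layers: a linear map followed by ReLU. Never updated below."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in base_weights]

def predict(head, x):
    """Step 3: a new trainable layer on top of the frozen features."""
    z = sum(w * f for w, f in zip(head["w"], base_features(x))) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(head, data):
    """Binary cross-entropy averaged over the dataset."""
    total = 0.0
    for x, y in data:
        p = min(max(predict(head, x), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def train_head(head, data, learning_rate=0.1, epochs=200):
    """Step 4: gradient descent that only touches the head's parameters."""
    for _ in range(epochs):
        for x, y in data:
            feats = base_features(x)
            error = predict(head, x) - y      # gradient of the loss w.r.t. z
            head["w"] = [w - learning_rate * error * f
                         for w, f in zip(head["w"], feats)]
            head["b"] -= learning_rate * error
    return head

data = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]
head = {"w": [0.0, 0.0], "b": 0.0}
frozen_copy = [row[:] for row in base_weights]

loss_before = mean_loss(head, data)
train_head(head, data)
loss_after = mean_loss(head, data)
print(loss_before, loss_after)
print(base_weights == frozen_copy)            # True: the base was not changed
```

Only the head's two weights and bias are updated, which is why training on top of a frozen base is cheap compared with training the whole network.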
Transfer Learning – VGG16
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# Set the input shape to the CIFAR-10 image size and the number of classes
input_shape = (32, 32, 3)
classes = 10

# Load the VGG16 model with the ImageNet weights, but without the final 1000-class output layers
base_model = VGG16(
    weights="imagenet",       # Load weights pre-trained on ImageNet.
    input_shape=input_shape,
    include_top=False,        # Do not include the ImageNet classifier at the top.
)
# Freeze the base_model so that its weights are not updated during training
base_model.trainable = False
[Figure: VGG16 architecture; include_top=False removes the final classifier layers]
Transfer Learning
[Figure: model summary; the non-trainable params are part of vgg16]

Transfer Learning – Fine Tuning
• The most common transfer learning workflow:
1. Take layers from a previously trained model.
2. Freeze them, so as to avoid destroying any of the information they
contain during future training rounds.
3. Add some new, trainable layers on top of the frozen layers. They will
learn to turn the old features into predictions on a new dataset.
4. Train the new layers on your dataset.
5. [optional] Fine-tuning, which consists of unfreezing the entire model and
re-training it on the new data with a very low learning rate.
Transfer Learning – Fine-tuning
# model, X_train and y_train are assumed to be defined on earlier slides
base_model.trainable = True   # Allow all weights to be trained
model.summary()

learning_rate = 1e-5          # Small learning rate

# Must re-compile the model for the change to trainable to take effect
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
    loss=keras.losses.categorical_crossentropy,
    metrics=[keras.metrics.categorical_crossentropy],
)

epochs = 5
model.fit(X_train, y_train, epochs=epochs)
Transfer Learning – Fine-tuning
[Figure: model summary; the vgg16 params are now trainable!]

Further important reading:
• Bias in Machine Learning
https://towardsdatascience.com/june-edition-bias-in-the-machine-994eadbccec2
