19 Image Classification
• https://deeplearning.web.unc.edu/files/2016/12/An-overview-of-gradient-descent-optimization-algorithm.pdf
• https://www.slideshare.net/infobuzz/back-propagation
• http://people.uncw.edu/tagliarinig/Courses/415/Lectures/An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
• https://hackernoon.com/visualizing-parts-of-convolutional-neural-networks-using-keras-and-cats-5cc01b214e59
• https://cs231n.github.io
• Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing
systems. 2012.
• https://medium.com/dbrs-innovation-labs/visualizing-neural-networks-in-virtual-space-7e3f62f7177
• https://www.kdnuggets.com/2016/06/visual-explanation-backpropagation-algorithm-neural-networks.html
• https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
• http://www.emergentmind.com/neural-network
• https://en.wikipedia.org/wiki/Convolutional_neural_network
Paper
• Name: "Imagenet classification with deep convolutional neural networks."
Advances in neural information processing systems. 2012
• Authors: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton
• Competition: ILSVRC-2012 (ImageNet Large Scale Visual Recognition Challenge)
OVERVIEW
www.tensorflow.com
CHALLENGES
• Illumination:
• Deformation:
http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture2.pdf
CHALLENGES
• Occlusion:
• Background Clutter:
http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture2.pdf
INITIAL ATTEMPTS
Detect edges
• Compute explicit “Rules” based on corners and boundaries and identify Labels based on
these rules.
ex: Two lines meeting at a corner are a cat’s ears.
Pitfalls
• Time consuming, since we have to start all over for another object label.
[Figure: a classifier is trained on training data and produces output labels for test data]
https://www.cs.toronto.edu/~kriz/cifar.html
K-NEAREST NEIGHBORS
• Use a distance metric (e.g., L1 or L2 distance) and compute the K nearest neighbors,
i.e., the K training images with the smallest distance to the chosen image.
• A majority vote is taken among the K neighbors and the winning label is assigned to
the test image (a minimal sketch follows below).
http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture2.pdf
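A minimal NumPy sketch of this nearest-neighbor rule (array names, shapes, and k are illustrative assumptions, not from the slides):

import numpy as np

def knn_predict(train_images, train_labels, test_image, k=5):
    # Flatten images into vectors and compute the L1 distance to every training image
    diffs = train_images.reshape(len(train_images), -1) - test_image.reshape(-1)
    distances = np.abs(diffs).sum(axis=1)   # use (diffs ** 2).sum(axis=1) for the L2 distance
    nearest = np.argsort(distances)[:k]     # indices of the K closest training images
    votes = train_labels[nearest]
    return np.bincount(votes).argmax()      # majority vote among the K neighbors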
K-NEAREST NEIGHBORS
• Curse of dimensionality.
http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture2.pdf
LINEAR CLASSIFICATION
A linear classifier is of the form
f(x,W) = Wx + b
x – input vector {x1, x2, ..., xn}, where xi is the value of a pixel dimension
W – weights assigned to each pixel dimension for each label, determined from the
training data
b – bias for each label.
f(x,W) – vector of scores corresponding to each label
https://www.pyimagesearch.com/2016/08/22/an-intro-to-linear-classification-with-python/
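A rough NumPy illustration of f(x,W) = Wx + b (the CIFAR-10-sized shapes are an assumption for the example):

import numpy as np

num_labels, num_pixels = 10, 32 * 32 * 3              # e.g. CIFAR-10: 10 labels, 3072 pixel dimensions
W = 0.001 * np.random.randn(num_labels, num_pixels)   # one row of weights per label
b = np.zeros(num_labels)                              # one bias per label
x = np.random.rand(num_pixels)                        # input image flattened into a vector

scores = W.dot(x) + b                # f(x,W): vector of scores, one per label
predicted_label = np.argmax(scores)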
INTERPRETING A LINEAR CLASSIFIER
• The linear classifier draws linear decision boundaries separating each category
from the rest of the categories.
http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture2.pdf
AN EXAMPLE
[Figure: the weight matrix W times the image x stretched into a column vector, plus the bias b, gives one score per label, e.g., dog, cat, and ship scores]
http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture2.pdf
LOSS FUNCTIONS
• A loss function for classification represents the price paid for inaccurate predictions in
classification problems (problems of identifying which category a particular observation
belongs to).
• It describes how far the result your network produced is from the expected result, i.e.,
it indicates the magnitude of the error your model made in its prediction.
Source: https://en.wikipedia.org/wiki/Loss_functions_for_classification
https://stackoverflow.com/questions/42877989/what-is-a-loss-function-in-simple-words
A loss function tells how good our current classifier is.
Source: https://www.pinterest.com/pin/349732727304147554; https://www.freepik.com/free-photo/car-in-
glossy-red_758995.htm#term=convert&page=1&position=42; https://study.com/academy/lesson/what-is-a-
natural-habitat-definition-habitat-destruction-quiz.html
Multiclass SVM loss:
Given the vector of scores s = f(xi, W) for an example xi with correct label yi (the slide's
example shows cat scores of 3.2, 1.3, and 2.2 across three example images), the SVM loss has the form:
Li = Σ_{j ≠ yi} max(0, sj − syi + 1)
Source: http://cs231n.stanford.edu/2017/
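A small NumPy sketch of this loss for a single example (the car and frog scores 5.1 and −1.7 follow the standard cs231n example and are not shown in the text above):

import numpy as np

def svm_loss_single(scores, correct_label, margin=1.0):
    # scores: the vector f(xi, W); correct_label: the index yi
    margins = np.maximum(0, scores - scores[correct_label] + margin)
    margins[correct_label] = 0              # the correct class contributes no loss
    return margins.sum()

# Cat image with scores [cat 3.2, car 5.1, frog -1.7] and correct label "cat"
print(svm_loss_single(np.array([3.2, 5.1, -1.7]), correct_label=0))   # 2.9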
OPTIMIZATION
• Optimization Algorithms are used to update weights and biases i.e. the internal
parameters of a model to reduce the error.
Source: https://medium.com/data-science-group-iitr/loss-functions-and-optimization-algorithms-demystified-
bb92daff331cs
Gradient Descent
• Gradient descent is a way to minimize an objective function J(w), parameterized by a
model's parameters w, by updating the parameters in the direction opposite to the
gradient of the objective.
Source: https://deeplearning.web.unc.edu/files/2016/12/An-overview-of-gradient-descent-optimization-
algorithm.pdf
https://giphy.com/gifs/gradient-O9rcZVmRcEGqI
Gradient Descent
Vanilla Gradient Descent Algorithm:
Source: https://machinelearningmastery.com/gradient-descent-for-machine-learning/
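Since the algorithm itself appears as a figure in the slides, here is a minimal sketch of the vanilla update rule w ← w − α ∇J(w) (the quadratic objective is a made-up example):

import numpy as np

def gradient_descent(grad_fn, w_init, learning_rate=0.1, num_steps=100):
    # grad_fn(w) returns the gradient of the objective J(w) at w
    w = w_init.copy()
    for _ in range(num_steps):
        w -= learning_rate * grad_fn(w)   # step against the gradient
    return w

# Example: minimize J(w) = ||w||^2, whose gradient is 2w; the minimum is at w = 0
w_star = gradient_descent(lambda w: 2 * w, w_init=np.array([3.0, -2.0]))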
NEURAL NETWORKS AND
BACKPROPAGATION ALGORITHM
INTRODUCTION
Source: https://www.slideshare.net/infobuzz/back-propagation
https://en.wikipedia.org/wiki/Backpropagation
INTUITION
• As the algorithm's name implies, the errors (and
therefore the learning) propagate backwards from
the output nodes to the inner nodes.
• So technically speaking, backpropagation is used
to calculate the gradient of the error of the network
with respect to the network's modifiable weights.
• This gradient is almost always then used in a
simple stochastic gradient descent algorithm to find
weights that minimize the error.
Source: https://www.slideshare.net/infobuzz/back-propagation
https://en.wikipedia.org/wiki/Backpropagation
BASIC NEURON MODEL - FEEDFORWARD
NETWORK
Source: http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
INPUTS TO NEURONS
• Arise from other neurons or from outside the network
• Nodes whose inputs arise outside the network are called input nodes and simply
copy values
• An input may excite or inhibit the response of the neuron to which it is applied,
depending upon the weight of the connection
Source: http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
Weights
• Normally, positive weights are considered as excitatory while negative weights are
thought of as inhibitory
• Learning is the process of modifying the weights in order to produce a network that
performs some function
Output
The response function is normally nonlinear
Examples include
• Sigmoid
• Piecewise linear
Source: http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
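A minimal sketch of a single neuron with a sigmoid response function (names are illustrative):

import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def neuron_output(inputs, weights, bias):
    # Positive weights excite, negative weights inhibit the neuron's response
    net = np.dot(weights, inputs) + bias
    return sigmoid(net)   # nonlinear response squashed into (0, 1)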
BACKPROPAGATION PREPARATION
• Training Set
A collection of input-output patterns that are used to train the network
• Testing Set
A collection of input-output patterns that are used to assess network performance
• Learning Rate (α)
A scalar parameter, analogous to step size in numerical integration, used to set
the rate of adjustments
Source:http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
NETWORK ERROR
• Total-Sum-Squared-Error (TSSE)
• Root-Mean-Squared-Error (RMSE)
Source:http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
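The two error measures can be sketched as follows (the 1/2 factor on TSSE and the exact normalization are assumptions; the source slides may define them slightly differently):

import numpy as np

def tsse(targets, outputs):
    # Total Sum-Squared Error over all patterns and output nodes
    return 0.5 * np.sum((targets - outputs) ** 2)

def rmse(targets, outputs):
    # Root Mean Squared Error over all patterns and output nodes
    return np.sqrt(np.mean((targets - outputs) ** 2))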
A PSEUDO-CODE ALGORITHM
• Randomly choose the initial weights
• While error is too large
• For each training pattern (presented in random order)
• Apply the inputs to the network
• Calculate the output for every neuron from the input layer, through the hidden
layer(s), to the output layer
• Calculate the error at the outputs
• Use the output error to compute error signals for pre-output layers
• Use the error signals to compute weight adjustments
• Apply the weight adjustments
• Periodically evaluate the network performance
Source:http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
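A compact NumPy sketch of this pseudo-code for one hidden layer of sigmoid units (network sizes, random data, the squared-error objective, and the omission of biases are all simplifying assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.random((4, 3))                                   # 4 training patterns, 3 inputs each
T = rng.random((4, 2))                                   # target outputs
W1 = rng.standard_normal((3, 5))                         # randomly chosen initial weights
W2 = rng.standard_normal((5, 2))
alpha = 0.5                                              # learning rate

for epoch in range(1000):                                # "while error is too large"
    for i in rng.permutation(len(X)):                    # patterns in random order
        h = sigmoid(X[i] @ W1)                           # outputs of the hidden layer
        o = sigmoid(h @ W2)                              # outputs of the output layer
        delta_o = (T[i] - o) * o * (1 - o)               # error signal at the outputs
        delta_h = (delta_o @ W2.T) * h * (1 - h)         # error signal for the hidden layer
        W2 += alpha * np.outer(h, delta_o)               # compute and apply weight adjustments
        W1 += alpha * np.outer(X[i], delta_h)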
APPLY INPUTS FROM A PATTERN
[Figure: feedforward pass from the input nodes to the output nodes]
• Input nodes compute only the identity function
Source http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
CALCULATE OUTPUTS FOR EACH NEURON BASED
ON THE PATTERN
[Figure: each neuron computes a weighted sum of its inputs and applies the response function to produce its output]
Source:
http://people.uncw.edu/tagliarinig/Courses/415/Lectures/An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
CALCULATE THE ERROR SIGNAL FOR EACH
HIDDEN NEURON
• The hidden neuron error signal δpj is given by
δpj = opj(1 − opj) Σk δpk Wkj (assuming a sigmoid response function; in general the
leading factor is f′(netpj)),
where δpk is the error signal of a post-synaptic neuron k and Wkj is the weight of
the connection from hidden neuron j to the post-synaptic neuron k
Source: http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
CALCULATE AND APPLY WEIGHT
ADJUSTMENTS
Source: http://people.uncw.edu/tagliarinig/Courses/415/Lectures/
An%20Introduction%20To%20The%20Backpropagation%20Algorithm.ppt
https://giphy.com/gifs/neural-networks-4LiMmbAcvgTQs
MERITS AND DEMERITS OF
BACKPROPAGATION
MERITS DEMERITS
Source: https://www.slideshare.net/infobuzz/back-propagation
https://en.wikipedia.org/wiki/Backpropagation
CONVOLUTIONAL NEURAL NETWORK
WHY CNN?
• ConvNets are powerful due to their ability to extract the core features of an image
and use those features to recognize images that contain similar features.
• Even in a two-layer CNN we can start to see the network paying a lot of attention
to regions like the whiskers, nose, and eyes of the cat.
• These are the types of features that allow the CNN to differentiate, for example, a cat
from a bird.
Source:
https://hackernoon.com/visualizing-parts-of-convolutional-neural-networks-using-keras-and-cats-5cc01b214e59
NEURAL NETWORK
Source:
https://cs231n.github.io
CNN ARCHITECTURE
Source: https://medium.com/dbrs-innovation-labs/visualizing-neural-networks-in-virtual-space-7e3f62f7177
CNN LAYERS
• Convolutional layers
• Activation layers
• Pooling layers
• Fully Connected Layer
CONVOLUTIONAL LAYER
• The convolutional layer is the core building block of a CNN.
• The CONV layer’s parameters consist of a set of learnable filters (Kernel).
• The Conv layer preserves the spatial structure of the image.
• As a filter moves over the image, it effectively checks for patterns in that section of the
image.
• During training, these filter weights change, so when it is time to evaluate an image,
the filters return high values where they detect a pattern they have seen before.
• The combinations of high responses from various filters let the network predict the
content of an image.
Source:https://en.wikipedia.org/wiki/Convolutional_neural_network
https://cs231n.github.io
https://medium.com/@Aj.Cheng/convolutional-neural-network-d9f69e473feb
CONVOLUTED IMAGE
Source: https://cs231n.github.io
CONVOLUTION
Source: https://cs231n.github.io
CONVOLUTION EXAMPLE
Image (5×5):      Kernel (3×3):
1 1 1 0 0         1 0 1
0 1 1 1 0         0 1 0
0 0 1 1 1         1 0 1
0 0 1 1 0
0 1 1 0 0
Convolved Feature: the 3×3 map obtained by sliding the kernel over the image (see the sketch below)
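A direct NumPy implementation of this sliding-window computation (stride 1, no zero-padding):

import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel over the image; each output value is the sum of the
    # elementwise products between the kernel and the current window
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1), dtype=image.dtype)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])
print(convolve2d(image, kernel))   # [[4 3 4], [2 4 3], [2 3 4]]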
CONVOLUTION EXAMPLE
Source: https://medium.com/dbrs-innovation-labs/visualizing-neural-networks-in-virtual-space-7e3f62f7177
CONVOLUTION(IMPORTANT TERMINOLOGY)
• Stride: The distance the window moves each time.
• Kernel: The “window” that moves over the image.
• Depth: The depth of the output volume is a hyperparameter. It corresponds to the
number of filters we would like to use, each learning to look for something
different in the input.
• Zero-padding: A hyperparameter. We use it to exactly preserve the spatial size
of the input volume so the input and output width and height are the same
(see the output-size sketch below).
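These hyperparameters fix the spatial size of the output volume through the standard formula (W − F + 2P) / S + 1, where W is the input width/height, F the kernel size, P the zero-padding, and S the stride; a quick sketch:

def conv_output_size(input_size, kernel_size, padding, stride):
    # Spatial size (width or height) of a convolutional layer's output volume
    return (input_size - kernel_size + 2 * padding) // stride + 1

print(conv_output_size(32, 3, 1, 1))   # 32 -> padding of 1 with a 3x3 kernel preserves the size
print(conv_output_size(32, 5, 0, 1))   # 28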
MULTIPLE FILTERS
Source: https://cs231n.github.io
SIMPLE FULLY CONNECTED NN VS CNN
Source: https://cs231n.github.io
CNN LAYERS
• Convolutional layers
• Activation layers
• Pooling layers
• Fully Connected Layer
CNN ARCHITECTURE
Source: https://medium.com/dbrs-innovation-labs/visualizing-neural-networks-in-virtual-space-7e3f62f7177
ACTIVATION LAYER
• The purpose of the Activation Layer is to squash the output of the Convolution
Layer into a range, typically [0,1] or [-1,1]
• This layer increases the nonlinear properties of the model and the overall network
without affecting the receptive fields of the convolution layer.
• Examples: tanh, sigmoid, ReLU
Source: https://cs231n.github.io
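The three activations named above can be written directly in NumPy:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes values into (0, 1)

def tanh(x):
    return np.tanh(x)                 # squashes values into (-1, 1)

def relu(x):
    return np.maximum(0, x)           # zeroes out negative values, keeps positive values unchanged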
DIFFERENT ACTIVATION FUNCTIONS
Source: https://hackernoon.com/visualizing-parts-of-convolutional-neural-networks-using-keras-and-cats-5cc01b214e59
MULTIPLE LAYERS OF CNN AND RELU
Source: https://cs231n.github.io
CNN LAYERS
• Convolutional layers
• Activation layers
• Pooling layers
• Fully Connected Layer
CNN ARCHITECTURE
Source: https://medium.com/dbrs-innovation-labs/visualizing-neural-networks-in-virtual-space-7e3f62f7177
POOLING LAYER
• The Pooling Layer's function is to progressively reduce the spatial size of the
representation to reduce the number of parameters and computation in the
network, and hence also to control overfitting.
• Max pooling and Average pooling are the most common pooling functions. Max
pooling takes the largest value from the window of the image currently covered by
the kernel, while average pooling takes the average of all values in the window.
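A sketch of 2×2 max pooling with stride 2 in NumPy (the non-overlapping window size is an assumption; it is the most common choice):

import numpy as np

def max_pool_2x2(feature_map):
    # Keep the largest value in each non-overlapping 2x2 window
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]       # drop odd edge rows/columns
    windows = trimmed.reshape(h // 2, 2, w // 2, 2)
    return windows.max(axis=(1, 3))                     # use .mean(axis=(1, 3)) for average pooling

print(max_pool_2x2(np.array([[1, 3, 2, 4],
                             [5, 6, 7, 8],
                             [3, 2, 1, 0],
                             [1, 2, 3, 4]])))           # [[6 8], [3 4]]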
POOLING LAYER(MAX POOL)
Source: https://sefiks.com/2017/11/03/a-gentle-introduction-to-convolutional-neural-networks/
POOLING LAYER(GRAPHICAL
REPRESENTATION)
Source:https://ithelp.ithome.com.tw/articles/10187424
SUMMARY OF CNN LAYERS
• Convolutional layers multiply the kernel values by the image window and optimize
the kernel weights over time using gradient descent
• Pooling layers describe a window of an image using a single value, which is the
max or the average of that window (Max Pool vs. Average Pool)
• Activation layers squash the values into a range, typically [0,1] or [-1,1]
• Fully Connected Layer: neurons have full connections to all activations in the
previous layer, as seen in regular Neural Networks. Their activations can hence
be computed with a matrix multiplication followed by a bias offset.
Source: https://cs231n.github.io
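Stacking these layers in the usual Conv → Activation → Pool → Fully Connected order can be sketched with Keras (referenced in the sources above); the specific layer sizes here are arbitrary choices, not taken from the slides:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)),  # convolution + activation
    layers.MaxPooling2D((2, 2)),                                            # pooling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),                                 # fully connected output layer
])
model.compile(optimizer="sgd", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()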
DEMO
• https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html
IMAGENET CLASSIFICATION WITH DEEP CONVOLUTIONAL
NEURAL NETWORKS
Source: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks."
Advances in neural information processing systems. 2012.
Outline
● Goal
● Dataset
● Architecture
● Overfitting
● Reducing Overfitting
● Results
ILSVRC: ImageNet Large Scale Visual Recognition Challenge
Source: http://vision.stanford.edu/teaching/cs231b_spring1415/slides/alexnet_tugce_kyunghee.pdf
Goal
Image Source: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks."
Advances in neural information processing systems. 2012.
THE ARCHITECTURE
Source: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks."
Advances in neural information processing systems. 2012.
THE ARCHITECTURE
Source: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks."
Advances in neural information processing systems. 2012.
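For orientation, the paper's architecture (five convolutional layers followed by three fully connected layers) can be approximated in Keras roughly as below; this single-GPU sketch ignores the original two-GPU split, local response normalization, and the exact input cropping (the 227×227 input size is a commonly used adjustment for this filter/stride combination):

from tensorflow import keras
from tensorflow.keras import layers

alexnet_like = keras.Sequential([
    layers.Conv2D(96, 11, strides=4, activation="relu", input_shape=(227, 227, 3)),
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(256, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(384, 3, padding="same", activation="relu"),
    layers.Conv2D(384, 3, padding="same", activation="relu"),
    layers.Conv2D(256, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(3, strides=2),
    layers.Flatten(),
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1000, activation="softmax"),   # 1000 ImageNet classes
])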
Training on Multiple GPUs