Deep Learning Unit2

DEEP LEARNING

UNIT2

By: DIVAKAR KESHRI
PhD NIT TRICHY
CONTENTS
 Introduction to:
 Shallow neural networks
 Deep neural networks
 Architecture Design
 Convolutional Neural Networks
 Introduction to Convolution (1D and 2D)
 Pooling
 Training of network
 Hyperparameter tuning
 Pre-trained models: AlexNet, GoogleNet, ResNet,
VGG-16, VGG-19, ImageNet
 Case study of CNN (Healthcare, Agriculture,
Stock Market, Weather Forecasting)
SHALLOW NEURAL NETWORKS

 Shallow neural networks consist of only
1 or 2 hidden layers.
 A shallow neural network gives us an
insight into what exactly is going on
inside a deep neural network.
 The figure below shows a shallow neural
network with 1 hidden layer, 1 input
layer and 1 output layer.
SHALLOW NEURAL NETWORKS
 A neuron can be thought of as a
combination of 2 parts:

 The first part computes the output Z, using
the inputs and the weights.
 The second part performs the activation on
Z to give out the final output A of the
neuron.
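The two parts can be written compactly as below. This is a sketch under assumptions: the weight vector w, the bias b, and the choice of the sigmoid as the activation g are not named on the slide and are introduced here only for illustration.

```latex
Z = w^{\top} x + b, \qquad A = g(Z), \qquad \text{for example } g(Z) = \frac{1}{1 + e^{-Z}}
```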
SHALLOW NEURAL NETWORKS
 The whole neural network computes the
output for a given input X by applying these
two steps in every neuron, layer by layer.
 These computations are also called the
forward-propagation equations.
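A minimal sketch of forward propagation through a network with 1 hidden layer and 1 output layer. The layer sizes, the parameter values, and the choice of tanh and sigmoid activations are assumptions made for illustration, not taken from the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: Z = W x + b for each neuron, then A = activation(Z)."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z))
    return outputs


def forward(x, params):
    """Forward propagation: input -> hidden layer -> output layer."""
    a1 = dense(x, params["W1"], params["b1"], math.tanh)
    a2 = dense(a1, params["W2"], params["b2"], sigmoid)
    return a2


# Hypothetical parameters: 2 inputs -> 3 hidden units -> 1 output.
params = {
    "W1": [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]],
    "b1": [0.0, 0.1, -0.1],
    "W2": [[0.3, -0.6, 0.9]],
    "b2": [0.05],
}
y_hat = forward([1.0, 2.0], params)
print(y_hat)
```

The same dense function is reused for both layers; a deeper network would simply call it more times.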
DEEP NEURAL NETWORKS
 A deep neural network (DNN) is an ANN with
multiple hidden layers between the input and
output layers.

 Similar to shallow ANNs, DNNs can model
complex non-linear relationships.
 The main purpose of a neural network is to
receive a set of inputs, perform progressively
complex calculations on them, and give
output to solve real-world problems like
classification.
 We have an input, an output, and a flow of
SHALLOW VS. DNN
ARCHITECTURE DESIGN
A GENERIC CNN ARCHITECTURE.
CONVOLUTIONAL NEURAL NETWORKS
 A CNN passes an image through the
network layers and outputs a final class.
 The network can have tens or hundreds of
layers, with each layer learning to detect
different features.
 Filters are applied to each training image
at different resolutions, and the output of
each convolved image is used as the input
to the next layer.
 The filters can start as very simple
features, such as brightness and edges,
and increase in complexity to features that
uniquely define the object as the layers
progress.
COMMONLY USED NETWORK LAYERS
 Convolution puts the input images through a set of
convolutional filters, each of which activates certain
features from the images.
 Rectified linear unit (ReLU) allows for faster and more
effective training by mapping negative values to zero
and maintaining positive values.
 Pooling simplifies the output by performing nonlinear
downsampling, reducing the number of parameters that
the network needs to learn.
 Fully connected layers “flatten” the network’s 2D spatial
features into a 1D vector that represents image-level
features for classification purposes.
 Softmax provides probabilities for each category in the dataset.
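Two of the layers listed above, ReLU and softmax, are simple enough to sketch in plain Python. The input values are made up for illustration, and the max-subtraction inside softmax is a common numerical-stability trick that the slides do not mention.

```python
import math


def relu(values):
    """Map negative values to zero and keep positive values unchanged."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


activations = relu([-2.0, 0.5, 3.0])
probabilities = softmax([2.0, 1.0, 0.1])
print(activations)
print(probabilities)
```

The largest score always receives the largest probability, which is why the predicted class is usually taken as the index of the maximum.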


CONVOLUTION (1D AND 2D)
 1D CNN:
 The convolutional kernel/filter moves in
just one direction (say, along the time axis) to
calculate the output.
 The output shape is a 1D array.
 Use cases: signal smoothing, sentence
classification.
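A minimal sketch of 1D convolution used for signal smoothing. It assumes stride 1, no padding, and no kernel flip (the convention CNN libraries typically use); the signal values and the 3-point averaging kernel are illustrative.

```python
def conv1d(signal, kernel):
    """Slide the kernel along one axis and take a weighted sum at each step."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


noisy = [1.0, 5.0, 3.0, 7.0, 5.0, 9.0]
smoothed = conv1d(noisy, [1 / 3, 1 / 3, 1 / 3])  # moving average
print(smoothed)
```

With 6 input samples and a kernel of length 3, the output has 6 - 3 + 1 = 4 samples.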
CONVOLUTION (1D AND 2D)
 2D CNN:
 The convolutional kernel moves in two
directions (x, y) to calculate the
convolutional output.
 The output is a 2D matrix.
 Use cases: Image Classification,
Generating New Images, Image
Inpainting, Image Colorization, etc.
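A minimal sketch of 2D convolution on a single-channel image, assuming stride 1, no padding, and no kernel flip. The 4x4 image and the 2x2 vertical-edge kernel are made up for illustration.

```python
def conv2d(image, kernel):
    """Slide the kernel over rows and columns; each output is a weighted sum."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1], [-1, 1]]
feature_map = conv2d(image, edge_kernel)
print(feature_map)
```

The feature map is non-zero only in the column where the edge sits, which is the sense in which a filter "activates" on a feature.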
1D VS. 2D CONVOLUTION
POOLING
 The pooling operation involves sliding a two-dimensional filter
over each channel of the feature map and summarising the
features lying within the region covered by the filter.
 Pooling layers are used to reduce the dimensions of the
feature maps.
 Thus, pooling reduces the number of parameters to learn and the
amount of computation performed in the network.
 The pooling layer summarises the features present in a region
of the feature map generated by a convolution layer.
 So, further operations are performed on summarised features
instead of precisely positioned features generated by the
convolution layer.
 This makes the model more robust to variations in the
position of the features in the input image.
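The reduction in feature-map size can be stated precisely. As a sketch, assume an input feature map of height n_h and width n_w, a square pooling filter of size f, stride s, and no padding (these symbols are introduced here and do not appear on the slide):

```latex
\text{out}_h = \left\lfloor \frac{n_h - f}{s} \right\rfloor + 1, \qquad
\text{out}_w = \left\lfloor \frac{n_w - f}{s} \right\rfloor + 1
```

For example, a 4 x 4 feature map pooled with f = 2 and s = 2 becomes 2 x 2. The number of channels is unchanged, because pooling is applied to each channel separately.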
TYPES OF POOLING LAYERS:
 Max Pooling
 Max pooling is a pooling operation that
selects the maximum element from the
region of the feature map covered by
the filter.
 Thus, the output after max-pooling layer
would be a feature map containing the
most prominent features of the previous
feature map.
TYPES OF POOLING LAYERS:
 Average Pooling
 Average pooling computes the average of
the elements present in the region of
the feature map covered by the filter.
 Thus, while max pooling gives the most
prominent feature in a particular patch of
the feature map, average pooling gives the
average of features present in a patch.
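A minimal sketch comparing max pooling and average pooling on one channel. The 2x2 filter, stride 2, and the feature-map values are assumptions chosen for illustration.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Summarise each size x size patch by its maximum or its average."""
    if mode == "max":
        reduce_fn = max
    elif mode == "average":
        reduce_fn = lambda patch: sum(patch) / len(patch)
    else:
        raise ValueError("mode must be 'max' or 'average'")

    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            patch = [
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(reduce_fn(patch))
        pooled.append(row)
    return pooled


fmap = [
    [2, 2, 7, 3],
    [9, 4, 6, 1],
    [8, 5, 2, 4],
    [3, 1, 2, 6],
]
max_pooled = pool2d(fmap, mode="max")      # [[9, 7], [8, 6]]
avg_pooled = pool2d(fmap, mode="average")  # [[4.25, 4.25], [4.25, 3.5]]
print(max_pooled, avg_pooled)
```

Both variants shrink the 4x4 map to 2x2; they differ only in how each patch is summarised.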
TRAINING OF NETWORK
 In simple terms: training a neural network means
finding the appropriate weights of the neural
connections.
 Backpropagation is the most common training
algorithm for neural networks.
 It makes gradient descent feasible for multi-layer neural
networks.
 Fitting a neural network involves using a training
dataset to update the model weights to create a good
mapping of inputs to outputs.
 This training process is solved using an optimization
algorithm that searches through a space of possible
values for the neural network model weights for a set of
weights that results in good performance on the training
dataset.
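A minimal sketch of weight search by gradient descent, using a single linear neuron and a mean-squared-error loss. The toy data, learning rate, and epoch count are assumptions. With only one neuron the gradients can be written by hand; in a multi-layer network, backpropagation computes the same kind of gradients with the chain rule.

```python
def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b by repeatedly stepping against the loss gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b = train_linear_neuron(xs, ys)
print(w, b)
```

The learned values approach w = 2 and b = 1, the weights that generated the training data.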
HYPER PARAMETER TUNING
 Hyperparameters are parameters that cannot be directly
learned from the regular training process.
 They are usually fixed before the actual training
process begins.
 These parameters express important properties
of the model such as its complexity or how fast it
should learn.
 Some examples of model hyperparameters
include:
 The penalty in a logistic regression classifier, i.e.
L1 or L2 regularization.
 The learning rate for training a neural network.
 The C and sigma hyperparameters for support
vector machines.
 The k in k-nearest neighbors.
HYPER PARAMETER TUNING
 Two widely used strategies for hyperparameter
tuning are:
 GridSearchCV & RandomizedSearchCV
 In the GridSearchCV approach, the machine
learning model is evaluated for every combination
of the listed hyperparameter values.
 As in the image, for C = [0.1, 0.2, 0.3, 0.4,
0.5] and Alpha = [0.1, 0.2, 0.3, 0.4]. For a
combination of C=0.3 and Alpha=0.2, the
performance score comes out to be
GRIDSEARCHCV
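The grid search idea can be sketched in plain Python. This is not scikit-learn's GridSearchCV: grid_search and toy_score are hypothetical names, and toy_score is a made-up stand-in for a cross-validated model score that happens to peak at C=0.3, alpha=0.2.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Evaluate every combination in the grid and keep the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, alpha):
    # Hypothetical score surface, not a real model evaluation.
    return 1.0 - (C - 0.3) ** 2 - (alpha - 0.2) ** 2


grid = {"C": [0.1, 0.2, 0.3, 0.4, 0.5], "alpha": [0.1, 0.2, 0.3, 0.4]}
best_params, best_score = grid_search(toy_score, grid)
print(best_params, best_score)
```

With 5 values of C and 4 values of alpha, the loop runs 5 x 4 = 20 evaluations, which is why the cost grows quickly as more hyperparameters are added.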
RANDOMIZEDSEARCHCV
 RandomizedSearchCV addresses the main
drawback of GridSearchCV, the cost of
evaluating every combination, by going
through only a fixed number of
hyperparameter settings.
 It moves within the grid in a random
fashion to look for a good set of
hyperparameters.
 This approach reduces unnecessary
computation.
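A minimal sketch of random search over the same kind of grid. It is not scikit-learn's RandomizedSearchCV: random_search and toy_score are hypothetical names, toy_score stands in for a real cross-validated score, and this version samples with replacement for simplicity.

```python
import random


def random_search(score_fn, grid, n_iter, seed=0):
    """Evaluate only n_iter randomly chosen settings from the grid."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in grid.items()}
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, alpha):
    # Hypothetical score surface, not a real model evaluation.
    return 1.0 - (C - 0.3) ** 2 - (alpha - 0.2) ** 2


grid = {"C": [0.1, 0.2, 0.3, 0.4, 0.5], "alpha": [0.1, 0.2, 0.3, 0.4]}
best_params, best_score = random_search(toy_score, grid, n_iter=8)
print(best_params, best_score)
```

Here only 8 of the 20 possible settings are scored. The result is the best of those 8, which is not guaranteed to be the best of the whole grid.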
DROPOUT
BATCH NORMALIZATION
PRE-TRAINED MODELS
ALEXNET
 The AlexNet contains 8 layers with weights.
 5 convolutional layers.
 3 fully connected layers.
 ReLU activation is applied after every layer except the
last one, which feeds a softmax that produces a distribution
over the 1000 class labels.
 Dropout is applied in the first two fully connected layers.
 As the figure above shows, the network also applies max pooling
after the first, second, and fifth convolutional layers.
 The kernels of the second, fourth, and fifth convolutional
layers are connected only to those kernel maps in the
previous layer, which reside on the same GPU.
 The kernels of the third convolutional layer are connected to
all kernel maps in the second layer.
 The neurons in the fully connected layers are connected to all
neurons in the previous layer.
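The layer sequence above can be traced numerically. This is a sketch under assumptions: the kernel sizes, strides, padding, and channel counts are the commonly cited AlexNet values and are not given on the slide, a 227x227 input is assumed so that the arithmetic works out, and the two-GPU split is ignored.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, output channels); pooling keeps the channels.
ALEXNET_FEATURE_LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]


def trace_shapes(input_size=227, input_channels=3):
    """Return (name, height, width, channels) after each feature layer."""
    size, channels = input_size, input_channels
    shapes = []
    for name, kernel, stride, padding, out_channels in ALEXNET_FEATURE_LAYERS:
        size = out_size(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, size, channels))
    return shapes


shapes = trace_shapes()
flattened = shapes[-1][1] * shapes[-1][2] * shapes[-1][3]
for shape in shapes:
    print(shape)
print("inputs to the first fully connected layer:", flattened)
```

Max pooling appears after the first, second, and fifth convolutional layers, matching the description above, and the final 6 x 6 x 256 volume is flattened before the three fully connected layers.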
RELU ACTIVATION FUNCTION
ALEXNET
VGG16 VS. VGG19
 VGG16 and VGG19 are both convolutional neural
networks (CNNs) that were developed by the Visual
Geometry Group (VGG) at the University of Oxford.
 Both networks are trained for image classification
tasks and are widely used as a base model for
transfer learning.
 The main difference between VGG16 and VGG19 is
the number of weight layers: VGG16 has 16 (13
convolutional and 3 fully connected), while VGG19 has
19 (16 convolutional and 3 fully connected).
 Both networks have a similar architecture, with
multiple convolutional and max pooling layers, and
fully connected layers at the end of the network.
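The difference in depth and size can be checked with a short count. This is a sketch under assumptions: the channel configurations below are the standard published VGG16 and VGG19 layouts (not listed on the slide), with 3x3 convolutions, a 224x224 RGB input, 2x2 max pooling ("M"), and 1000 output classes.

```python
VGG_CONFIGS = {
    "VGG16": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"],
    "VGG19": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
              512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}


def summarize(config, input_size=224, input_channels=3, num_classes=1000):
    """Return (number of weight layers, number of parameters)."""
    size, channels = input_size, input_channels
    conv_layers, params = 0, 0
    for item in config:
        if item == "M":
            size //= 2  # 2x2 max pooling halves height and width
        else:
            params += 3 * 3 * channels * item + item  # weights + biases
            channels = item
            conv_layers += 1
    fc_sizes = [size * size * channels, 4096, 4096, num_classes]
    for fan_in, fan_out in zip(fc_sizes, fc_sizes[1:]):
        params += fan_in * fan_out + fan_out
    return conv_layers + 3, params


for name, config in VGG_CONFIGS.items():
    layers, params = summarize(config)
    print(name, layers, "weight layers,", params, "parameters")
```

Under these assumptions the three extra convolutional layers in VGG19 add only a few million parameters, because most of the parameters sit in the fully connected layers.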
VGG16 VS. VGG19
 VGG16 and VGG19 use small convolutional filters
(3x3) with a stride of 1 and are stacked multiple times
with max pooling layers in between to reduce the
spatial dimension.
 The number of filters in each convolutional layer is
increased as the network progresses, allowing the
network to learn more complex features from the input
image.
 The main advantage of VGG19 over VGG16 is that it
has more layers, which enables it to learn more
complex representations of the data.
 VGG19 can be slightly more accurate than VGG16, but it is
also heavier and requires more memory and computational
resources.
GOOGLENET
INCEPTION MODULE
AUXILIARY CLASSIFIER
GOOGLENET
 GoogLeNet (often written GoogleNet), also known as Inception v1, is a deep
convolutional neural network architecture
developed by Google researchers in 2014.
 It was designed to achieve high accuracy on the
ImageNet Large Scale Visual Recognition Challenge
(ILSVRC) dataset, which consists of over 1 million
images in 1,000 categories.
 GoogLeNet introduced a novel building block called
the Inception module, which uses a combination of
1x1, 3x3, and 5x5 convolutions, as well as pooling
operations, to extract features at different scales.
 By using these different convolutional filters in
parallel, the network is able to capture both local
and global features, while keeping the number of
parameters low.
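One way the Inception design keeps the parameter count low, which the slide does not spell out, is to place a 1x1 convolution before the larger 3x3 and 5x5 convolutions to reduce the number of channels first. The sketch below compares weight counts with and without that reduction; the channel numbers are illustrative assumptions and biases are ignored.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


# Illustrative channel counts for one 5x5 branch of an Inception module.
in_channels, reduced, out_channels = 192, 16, 32

# 5x5 convolution applied directly to all input channels.
direct = conv_weights(5, in_channels, out_channels)

# 1x1 convolution to shrink the channels, then the 5x5 convolution.
bottleneck = conv_weights(1, in_channels, reduced) + conv_weights(
    5, reduced, out_channels
)

print(direct, bottleneck, round(direct / bottleneck, 1))
```

With these numbers the reduced branch needs roughly a tenth of the weights of the direct 5x5 convolution.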
GOOGLENET
 The GoogLeNet architecture consists of 22 layers,
with a total of about 5 million parameters.
 It also includes auxiliary classifiers at
intermediate layers to encourage gradient flow
during training and reduce overfitting.
 The network uses ReLU activation functions;
batch normalization was not part of the original
design and was added in later Inception versions
to speed up training and improve performance.
 GoogLeNet achieved state-of-the-art results on
the ILSVRC dataset in 2014, with a top-5 error
rate of 6.7%.
 It has since been surpassed by newer
architectures, but remains an important
contribution to the field of computer vision and
RESNET

ResNet-34
RESNET
 ResNet, short for Residual Network, is a deep
convolutional neural network architecture that was
introduced by Microsoft researchers in 2015.
 It was designed to address the problem of vanishing
gradients in very deep networks by introducing skip
connections, or residual connections, that allow
gradients to flow directly from later layers back
to earlier layers.
 The basic building block of the ResNet architecture is
the residual block, which consists of two or three
convolutional layers and a skip connection.
 The skip connection adds the input of the block to its
output, allowing the network to learn residual functions
rather than complete mappings.
 By doing so, the network is able to preserve information
from earlier layers and avoid the degradation of
accuracy that can occur in very deep networks.
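A minimal sketch of a residual block, y = ReLU(F(x) + x). To keep it short, small fully connected layers stand in for the convolutional layers, and the weights are made up; only the skip connection itself follows the description above.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def linear(values, weights):
    """Stand-in for a convolutional layer: one weighted sum per output."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Two layers compute the residual F(x); the skip connection adds x."""
    fx = linear(relu(linear(x, w1)), w2)
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

print(residual_block(x, zeros, zeros))        # F(x) = 0, so the input passes through
print(residual_block(x, identity, identity))  # F(x) = x, so the output is 2x
```

When the residual function outputs zero, the block simply passes its (non-negative) input through, which is why adding such blocks should not make a deep network harder to fit.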
RESNET
 ResNet comes in various depths, with ResNet-18,
ResNet-34, ResNet-50, ResNet-101, and ResNet-152
being the most common variants. The
numbers in the names correspond to the number
of weighted layers in the network. ResNet-50, for example,
has 50 layers: 49 convolutional layers
and 1 fully connected layer.
 ResNet has been widely used in computer vision
applications, and it has achieved state-of-the-art
results on various benchmarks, including the
ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) dataset.
 Its success has inspired the development of
other architectures that use skip connections,
such as DenseNet. Highway Networks, proposed
slightly earlier, use a related gated form of
skip connection.
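The depth numbers in the ResNet variant names can be checked by counting weighted layers. This is a sketch under assumptions: the per-stage block counts are the standard published configurations (not listed on the slides), and the count includes one stem convolution, the convolutions inside the residual blocks, and one final fully connected layer, while leaving out the 1x1 projection convolutions on some skip connections.

```python
# (convolutions per residual block, residual blocks in each of the 4 stages)
RESNET_BLOCKS = {
    "ResNet-18": (2, [2, 2, 2, 2]),
    "ResNet-34": (2, [3, 4, 6, 3]),
    "ResNet-50": (3, [3, 4, 6, 3]),
    "ResNet-101": (3, [3, 4, 23, 3]),
    "ResNet-152": (3, [3, 8, 36, 3]),
}


def count_layers(convs_per_block, blocks_per_stage):
    """Return (convolutional layers, fully connected layers)."""
    conv_layers = 1 + convs_per_block * sum(blocks_per_stage)  # 1 stem conv
    fc_layers = 1
    return conv_layers, fc_layers


depths = {name: sum(count_layers(*cfg)) for name, cfg in RESNET_BLOCKS.items()}
print(depths)
```

Under this counting convention each total matches the number in the variant's name.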
