Neural Networks
• The idea behind an artificial neural network is to mimic the human brain.
• The brain's neurons are densely interconnected; this connectivity helps them pass information and data along, process it, and generate output.
• The human brain can learn from experience and train itself on data.
• This capacity for learning and training makes the human brain a very important organ.
[Figure: a biological neuron alongside an artificial neural network]
We want to build machines, in the form of artificial neural networks, that work in the same manner as the human brain.
Artificial Neural Network
• Artificial neural networks are inspired by the biological neurons within the human body, which activate under certain circumstances, resulting in a related action performed by the body in response.
• As with traditional machine learning algorithms, there are certain values (the weights and biases) that the neural network learns in the training phase.
• Neural networks are just one of many tools and approaches used in machine learning algorithms. The neural network itself may be used as a piece in many different machine learning algorithms to process complex data inputs into a space that computers can understand.
• Neural networks are being applied to many real-life problems today, including speech and image recognition, spam email filtering, finance, and medical diagnosis, to name a few.
Components of Neural Network
Weights are numeric values that are multiplied by inputs. During backpropagation they are modified to reduce the loss. In simple words, weights are the values a neural network learns: they self-adjust based on the difference between the predicted outputs and the training targets.
Activation function is a mathematical formula that helps the neuron switch ON or OFF.
• The hidden layer represents the intermediary nodes that divide the input space into regions with (soft) boundaries. Each node takes in a set of weighted inputs and produces an output through an activation function.
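To make these components concrete, here is a minimal Python sketch of a single neuron. The input values, weights, bias, and the choice of sigmoid activation below are illustrative assumptions, not values from any trained network.

```python
import math

def sigmoid(z):
    # Activation function: squashes the weighted sum into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    # Multiply each input by its weight, sum, and add the bias
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Pass the weighted sum through the activation function
    return sigmoid(z)

# Illustrative values (not from a trained network)
print(neuron_output([0.5, 0.8], [0.4, -0.2], 0.1))
```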
Without a neural network: Let's look at the example below. We have a machine that we have trained with four types of cats. Once the training is done, we provide that machine a random image containing a cat. Since this cat is not similar to the cats on which we trained our system, and the background has also changed, without a neural network our machine would not identify the cat in the picture. Basically, the machine gets confused in figuring out where the cat is.
With a neural network: However, with a neural network, even if we have not trained our machine on that particular cat, it can still identify certain features of a cat that it learned during training, match those features with the cat in the image, and identify the cat. This example clearly shows the importance of the concept of a neural network.
Working of Artificial Neural Network
• There are various activation functions available, chosen according to the nature of the input values. Once the output is generated from the final layer of the neural network, the loss function (comparing predicted output with the target) is calculated, and backpropagation is performed to adjust the weights and minimize the loss. Finding optimal values of the weights is what the overall training process revolves around.
Perceptron
Instead of directly getting into the working of artificial neural networks, let's break down and try to understand the neural network's basic unit, which is called a perceptron.
So, a perceptron can be defined as a single-layer neural network that classifies linearly separable data. It consists of four major components, which are as follows:
1. Inputs
The inputs (x) are fed into the input layer and multiplied by their allotted weights (w); the products are then added together to form the weighted sum. This weighted sum is then passed through the pertinent activation function.
2. Weights and Bias
When an input variable is fed into the network, a random value is initially assigned as the weight of that particular input, such that each individual weight represents the importance of that input for making correct predictions of the result.
3. Summation Function
After the weights are assigned to the inputs, the network computes the product of each input and its weight. The weighted sum is then calculated by the summation function, which adds all of these products together.
4. Activation Function
• Since non-linear activation functions have well-defined derivatives, the gradient computations that backpropagation relies on can be carried out.
• They permit the stacking of several layers of neurons, which is what makes the creation of deep neural networks possible. (A perceptron sketch follows below.)
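Putting the four components together, here is a minimal Python sketch of a single-layer perceptron: random initial weights and bias, a summation function, a step activation, and the classic perceptron update rule. The AND dataset and the learning rate are illustrative assumptions.

```python
import random

def step(z):
    # Step activation: the neuron switches ON (1) or OFF (0)
    return 1 if z >= 0 else 0

def predict(x, weights, bias):
    # Summation function: weighted sum of inputs plus bias
    z = sum(xi * wi for xi, wi in zip(x, weights)) + bias
    return step(z)

# Illustrative, linearly separable training data: logical AND
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

# Weights and bias start as random values
weights = [random.uniform(-1, 1) for _ in range(2)]
bias = random.uniform(-1, 1)
lr = 0.1  # learning rate

for _ in range(50):  # training epochs
    for x, target in data:
        error = target - predict(x, weights, bias)
        # Perceptron rule: nudge each weight in the direction that reduces the error
        weights = [w + lr * error * xi for w, xi in zip(weights, x)]
        bias += lr * error

print([predict(x, weights, bias) for x, _ in data])  # expect [0, 0, 0, 1]
```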
1. Sigmoid or Logistic Activation Function
• Formula: $\sigma(x) = \frac{1}{1 + e^{-x}}$
• Range: (0, 1)
2. Tanh (Hyperbolic Tangent) Activation Function
The tanh activation function works much better than the sigmoid function; we can simply say it is an advanced version of the sigmoid activation function. Since its value range is between -1 and 1, it is used in the hidden layers of a neural network, and for this reason it makes the learning process much easier.
3. ReLU (Rectified Linear Unit) Activation Function
• Formula: $f(x) = \max(0, x)$
• Range: [0, ∞)
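A short Python sketch of the three activation functions discussed above; the test inputs are arbitrary values chosen only to show the different output ranges.

```python
import math

def sigmoid(x):
    # Sigmoid: output in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Tanh: output in (-1, 1); zero-centred, which eases learning in hidden layers
    return math.tanh(x)

def relu(x):
    # ReLU: passes positive values through, zeroes out negatives
    return max(0.0, x)

for x in (-2.0, 0.0, 2.0):  # arbitrary test inputs
    print(x, sigmoid(x), tanh(x), relu(x))
```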
Worked Example
Inputs
x1 = 0.05, x2 = 0.10
Initial weights
w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30
w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55
Bias values
b1 = 0.35, b2 = 0.60
Target values
T1 = 0.01, T2 = 0.99
Forward Pass
H1=x1×w1+x2×w2+b1
H1=0.05×0.15+0.10×0.20+0.35
H1=0.3775
Similarly, H2 = x1×w3 + x2×w4 + b1 = 0.05×0.25 + 0.10×0.30 + 0.35 = 0.3925. To find the value of y1, we multiply the outputs of H1 and H2 by their corresponding weights (w5, w6), add the bias b2, and pass the result through the activation function; y2 is computed in the same way with w7 and w8.
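The forward pass can be checked with a short Python sketch using the values above. Applying a sigmoid after each weighted sum is the standard choice in this classic worked example; it is an assumption here, since the slides do not state the activation explicitly.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Values from the example above
x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60

# Hidden layer: weighted sums, then (assumed) sigmoid activation
h1_net = x1 * w1 + x2 * w2 + b1   # 0.3775
h2_net = x1 * w3 + x2 * w4 + b1   # 0.3925
h1_out, h2_out = sigmoid(h1_net), sigmoid(h2_net)

# Output layer: weighted sums of the hidden outputs, then activation
y1 = sigmoid(h1_out * w5 + h2_out * w6 + b2)
y2 = sigmoid(h1_out * w7 + h2_out * w8 + b2)
print(h1_net, h2_net, y1, y2)
```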
Backpropagation
Backpropagation is one of the important concepts of a neural network. Our task is to classify our data as well as possible. For this, we have to update the weight and bias parameters, but how can we do that in a deep neural network? In the linear regression model, we use gradient descent to optimize the parameters. Similarly, here we also use the gradient descent algorithm, via backpropagation.
For a single training example, the backpropagation algorithm calculates the gradient of the error function with respect to each weight in the network. Backpropagation algorithms are a set of methods used to efficiently train artificial neural networks following a gradient descent approach that exploits the chain rule.
The main features of backpropagation are that it is an iterative, recursive and efficient method for calculating the updated weights, improving the network until it can perform the task for which it is being trained. Backpropagation requires the derivatives of the activation functions to be known at network design time.
When training a neural network, the cost value J quantifies the network’s error, i.e.,
its output’s deviation from the ground truth. We calculate it as the average error over
all the objects in the training set, and our goal is to minimize it.
For example, let's say we have a network that classifies animals either as cats or dogs. It has two output neurons, $\hat{y}_1$ and $\hat{y}_2$, where the former represents the probability that the animal is a cat and the latter that it's a dog. Given an image of a cat, we expect $\hat{y}_1 = 1$ and $\hat{y}_2 = 0$.
However, if the network outputs different values, we can quantify our error on that image as the squared distance:

$$J = \frac{1}{n} \sum_i (y_i - \hat{y}_i)^2$$

where the $y_i$ are the desired outputs and the $\hat{y}_i$ are the network's actual outputs.
We use the cost to update the weights and biases so that the actual outputs
get as close as possible to the desired values. To decide whether to increase
or decrease a coefficient, we calculate its partial derivative using
backpropagation. Let’s explain it with an example.
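As an illustration, the sign of the partial derivative tells us which way to move a weight. The sketch below estimates $\partial J / \partial w$ numerically by finite differences for a single one-input sigmoid neuron; backpropagation computes the same quantity analytically via the chain rule. The input, bias, target, and learning-rate values are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, x=0.5, b=0.1, target=1.0):
    # Squared-error cost for a single one-input neuron (illustrative values)
    y_hat = sigmoid(w * x + b)
    return (target - y_hat) ** 2

w, eps, lr = 0.4, 1e-6, 0.5
for step in range(5):
    # Finite-difference estimate of dJ/dw; backprop gets this via the chain rule
    grad = (cost(w + eps) - cost(w - eps)) / (2 * eps)
    w -= lr * grad  # move against the gradient to reduce the cost
    print(step, round(w, 4), round(cost(w), 6))
```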
Common optimization algorithms used to minimize the cost include:
• Gradient descent
• Adagrad
• Momentum
• Adam
• FTRL
• RMSprop, etc.
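These optimizers differ in how they turn a gradient into a weight update. Below is a minimal sketch of three of the update rules (plain gradient descent, momentum, and Adam); the hyperparameter values are typical defaults, assumed here for illustration only.

```python
def sgd(w, grad, lr=0.01):
    # Plain gradient descent: step against the gradient
    return w - lr * grad

def momentum_step(w, grad, v, lr=0.01, beta=0.9):
    # Momentum: accumulate a velocity that smooths successive updates
    v = beta * v + grad
    return w - lr * v, v

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: per-weight adaptive step from running moment estimates
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)  # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (v_hat ** 0.5 + eps), m, v

# Illustrative single update with a made-up gradient value
print(sgd(1.0, grad=0.2))  # 0.998
```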
Types of Neural Network
Multilayer Perceptron (MLP)
• MLP models are the most basic deep neural networks; they are composed of a series of fully connected layers.
• Today, MLP machine learning methods can be used where the high computing power required by modern deep learning architectures is not available.
• Each new layer is a set of nonlinear functions of a weighted sum of all outputs
(fully connected) from the prior one.
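A minimal numpy sketch of an MLP forward pass: two fully connected layers, each a nonlinear function of a weighted sum of all outputs from the prior layer. The layer sizes, random initialization, and tanh activations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes: 4 inputs -> 8 hidden units -> 3 outputs
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def mlp_forward(x):
    # Each layer: nonlinear function of a weighted sum of ALL prior outputs
    h = np.tanh(x @ W1 + b1)      # fully connected hidden layer
    return np.tanh(h @ W2 + b2)   # fully connected output layer

print(mlp_forward(rng.normal(size=4)))
```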
Convolutional Neural Network (CNN)
• CNNs are most commonly employed in computer vision. Given a series of images or videos from the real world, a CNN-based AI system learns to automatically extract the features of these inputs to complete a specific task, e.g., image classification, face authentication, and image semantic segmentation.
• Different from fully connected layers in MLPs, in CNN models, one or multiple
convolution layers extract the simple features from input by executing
convolution operations. Each layer is a set of nonlinear functions of weighted
sums at different coordinates of spatially nearby subsets of outputs from the prior
layer, which allows the weights to be reused.
It is a specialized type of neural network that can learn spatial hierarchies of
features directly from pixel values by using filters that scan over the input
image and extract relevant features.
During training, the CNN uses backpropagation to adjust the weights and
biases of the network to optimize its performance. This process involves
propagating the error back through the network and updating the parameters
of the network to minimize the error.
CNNs have been used to achieve state-of-the-art performance in a variety of
image recognition and classification tasks, including object detection, face
recognition, and image segmentation. They are also used in natural language
processing for tasks such as text classification and sentiment analysis.
CNNs are a powerful tool for image analysis and recognition, and they have
shown great potential in many applications. Their ability to learn spatial
hierarchies of features directly from pixel values makes them well-suited for
a wide range of image-related tasks.
Convolutional Layers:
• They apply convolution operations to the input data using learnable filters or kernels.
• Each filter scans the input image and extracts a specific feature, such as edges, textures, or shapes.
Pooling Layers:
• Pooling helps decrease the computational load and makes the network more robust to variations in the input data.
• This helps to reduce overfitting and improve the efficiency of the network.
Fully Connected Layers:
•Fully connected layers are traditional neural network layers where each
neuron is connected to every neuron in the previous and subsequent layers.
•They are typically used in the later stages of the network to make
predictions based on the learned features.
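A minimal numpy sketch of the layer types just described: one filter slides over the image computing weighted sums of spatially nearby patches (so the weights are reused at every position), a ReLU is applied, and a 2×2 max pool downsamples the result. The random "image" and the simple edge-detecting filter are illustrative assumptions.

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; each output is a weighted sum of
    # a spatially nearby patch, with the same weights reused everywhere
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(x, size=2):
    # Max pooling: keep the strongest response in each size x size patch
    oh, ow = x.shape[0] // size, x.shape[1] // size
    return x[:oh*size, :ow*size].reshape(oh, size, ow, size).max(axis=(1, 3))

image = np.random.default_rng(0).random((6, 6))   # illustrative "image"
kernel = np.array([[1.0, -1.0], [1.0, -1.0]])     # simple vertical-edge filter
feature_map = np.maximum(conv2d(image, kernel), 0)  # convolution + ReLU
print(max_pool(feature_map).shape)                  # (2, 2)
```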
• AlexNet. For image classification, as the first CNN to win the ImageNet Challenge in 2012, AlexNet consists of five convolution layers and three fully connected layers. AlexNet requires 61 million weights and 724 million MACs (multiply-accumulate operations) to classify an image of size 227×227.
• ResNet. ResNet, the state-of-the-art effort, uses the "shortcut" structure to reach human-level accuracy, with a top-5 error rate below 5%. In addition, the "shortcut" module is used to solve the gradient vanishing problem during the training process, making it possible to train a DNN model with a deeper structure. The performance of popular CNNs applied to AI vision tasks has gradually increased over the years, surpassing human vision (a 5% error rate).
Recurrent Neural Network (RNN)
• The input of an RNN consists of the current sample and the previously seen samples. The connections between nodes therefore form a directed graph along a temporal sequence. Furthermore, each neuron in an RNN owns an internal memory that keeps the information computed from the previous samples.
• RNN models are widely used in Natural Language Processing (NLP) because they excel at processing data whose input length is not fixed. The task of the AI here is to build a system that can comprehend natural language spoken by humans, e.g., natural language modeling, word embedding, and machine translation.
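A minimal numpy sketch of a vanilla RNN step: at each time step, the hidden state (the neuron's internal memory) combines the current input with information carried over from the previous samples. The input size, hidden size, and random weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 5   # illustrative sizes

W_xh = rng.normal(scale=0.1, size=(input_size, hidden_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The new hidden state mixes the current input with the previous state,
    # so information from earlier samples persists across time steps
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

h = np.zeros(hidden_size)                     # initial memory
sequence = rng.normal(size=(4, input_size))   # a sequence of 4 samples
for x_t in sequence:                          # variable-length input is fine
    h = rnn_step(x_t, h)
print(h)
```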