4 Neural Networks

COMPILED BY: ER. SANTOSH PANDEYA
Introduction:

 A neural network is a type of machine learning model inspired by the structure and function of the human brain.
 It is made up of interconnected "neurons", which are modeled after the biological neurons in the brain.
 A neural network makes decisions in a manner similar to the human brain, using processes that mimic the way biological neurons work together to identify phenomena, weigh options and arrive at conclusions.
 Neural networks are used for a wide variety of AI tasks, including image and speech recognition, natural language processing (NLP) and control systems.
 They are trained on large datasets; the training process adjusts the weights and biases of the neurons to minimize the error between the network's outputs and the true outputs.
 They form the basis of deep learning.
 They are capable of handling non-linearity and complexity in the data.



Biological Neural Networks

 Each neuron has a very simple structure, but an army of such elements constitutes a tremendous processing power.
 Neuron: the fundamental functional unit of all nervous system tissue.
• Soma: the cell body, which contains the nucleus
• Dendrites: a number of fibres that carry the input
• Axon: a single long fibre with many branches that carries the output
• Synapse: the junction between an axon and a dendrite; each neuron forms synapses with 10 to 100,000 other neurons



HOW DOES THE BRAIN WORK?

Signals are propagated from neuron to neuron by electrochemical reactions:

1. Chemical substances released from the synapses enter the dendrites, raising or lowering the electrical potential of the cell body.
2. When the potential reaches a threshold, an electrical pulse (action potential) is sent down the axon.
3. The pulse spreads out along the branches of the axon, eventually reaching synapses and releasing transmitters into the bodies of other cells.
   1. Excitatory synapses: increase the potential
   2. Inhibitory synapses: decrease the potential



Relationship between biological and artificial neural networks

Biological Neural Network  ->  Artificial Neural Network
Dendrites                  ->  Inputs
Cell nucleus (soma)        ->  Nodes
Synapse                    ->  Weights
Axon                       ->  Output



The architecture of an artificial neural network:
To understand the architecture of an artificial neural network, we first have to understand what a neural network consists of. A neural network consists of a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various types of layers available in an artificial neural network.

An artificial neural network primarily consists of three layers:



Input Layer:
 As the name suggests, it accepts inputs in several different formats provided by the programmer.

Hidden Layer:
 The hidden layer sits between the input and output layers. It performs all the calculations needed to find hidden features and patterns.

Output Layer:
 The input goes through a series of transformations in the hidden layers, finally producing the output that is conveyed by this layer.
 The artificial neural network takes the inputs, computes the weighted sum of the inputs and adds a bias. This computation is represented in the form of a transfer function.



 The weighted total is then passed as an input to an activation function to produce the output.
 Activation functions decide whether a node should fire or not. Only the nodes that fire pass their signal on to the output layer.
 There are distinctive activation functions available that can be applied depending on the sort of task we are performing. A minimal sketch of this forward computation is shown below.
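A minimal sketch of this weighted-sum-plus-bias computation in Python (illustrative only; the function name, weights and the choice of a sigmoid activation are our own, not from the slides):

import math

def neuron_output(inputs, weights, bias):
    # weighted sum of the inputs plus a bias term
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # sigmoid activation squashes the sum into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-total))

print(neuron_output([1.0, 0.0], [0.5, 0.3], -0.4))   # example forward pass through one neuron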



Artificial Neural Networks (ANN)

• Consists of a number of very simple and highly interconnected processors called neurons.
• The neurons are connected by weighted links that pass signals from one neuron to another.
• The output signal is transmitted through the neuron's outgoing connection.
• The outgoing connection splits into a number of branches that transmit the same signal.
• The outgoing branches terminate at the incoming connections of other neurons in the network.

Characteristics of ANN

• Adaptive learning
• Self-organization
• Fault (error) tolerance
• Real-time operation
• Parallel information processing

Learning in ANN

Supervised learning

• Uses a set of inputs for which the desired outputs are known.
• Example: the back-propagation algorithm.

Unsupervised learning

• Uses a set of inputs for which no desired outputs are known.
• The system is self-organizing; that is, it organizes itself internally.
• A human must examine the final categories to assign meaning and determine the usefulness of the results.
• Example: the self-organizing map.



The Neuron as a simple computing element: Diagram of a neuron

The neuron computes the weighted sum of the input signals and compares the result with a threshold value, θ. If the net input is less than the threshold, the neuron output is −1; but if the net input is greater than or equal to the threshold, the neuron becomes activated and its output attains the value +1.

The neuron uses the following transfer, or activation, function:

$$X = \sum_{i=1}^{n} x_i w_i, \qquad Y = \begin{cases} +1 & \text{if } X \ge \theta \\ -1 & \text{if } X < \theta \end{cases}$$

This type of activation function is called a sign function.
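For example (an illustrative case, not from the slides): with inputs x = (1, 0), weights w = (0.5, 0.3) and threshold θ = 0.4, the weighted sum is X = 1(0.5) + 0(0.3) = 0.5 ≥ θ, so the neuron outputs Y = +1.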



Activation Functions for Neuron
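The figure listing the activation functions is not reproduced here. A minimal Python sketch of four commonly used ones (the step, sign, sigmoid and linear functions; the function names are our own) is:

import math

def step(x):       # 1 if the net input reaches 0, else 0
    return 1.0 if x >= 0 else 0.0

def sign(x):       # +1 if the net input reaches 0, else -1
    return 1.0 if x >= 0 else -1.0

def sigmoid(x):    # smooth squashing of the net input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):     # identity: the output equals the net input
    return x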



Perceptron
• The perceptron is one of the simplest artificial neural network architectures.
• It was introduced by Frank Rosenblatt in 1957.
• It is the simplest type of feedforward neural network, consisting of a single layer of input nodes that are fully connected to a layer of output nodes.
• It can learn linearly separable patterns.
• It uses a slightly different type of artificial neuron known as a threshold logic unit (TLU), first introduced by Warren McCulloch and Walter Pitts in the 1940s.
• The operation of Rosenblatt's perceptron is based on the McCulloch and Pitts neuron model. The model consists of a linear combiner followed by a hard limiter.
• The weighted sum of the inputs is applied to the hard limiter, which produces an output equal to +1 if its input is positive and −1 if its input is negative.



Contd…..

 The perceptron is one of the earliest learning systems.
 It can be regarded as a trainable classifier.
 The Mark I Perceptron was capable of making binary decisions.
 The idea was to start from a binary image (in the form of pixels: zeros or ones).
 A number of associator units receive inputs from different sets of image pixels and produce binary outputs at the following levels.
 The perceptron uses an error-correction learning algorithm in which the weights are modified after erroneous decisions as follows:

a) If the decision is correct, the weights are not adjusted.
b) If the erroneous decision is the r-th one in the list of possible decisions and the correct decision is the s-th one, some values a_j are deducted from the weights w_rj and added to the weights w_sj (j = 1, 2, …, n).



Contd…

Fig: Single-layer two-input perceptron



Contd…

The aim of the perceptron is to classify inputs x1, x2, x3, x4, …, xn into one of two classes, say A1 and A2.

In the case of an elementary perceptron, the n-dimensional input space is divided by a hyperplane into two decision regions.

The hyperplane is defined by the linearly separable function shown below.
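The function itself appears only as a figure in the source; in its standard form the separating hyperplane is

$$\sum_{i=1}^{n} x_i w_i - \theta = 0$$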

How does the perceptron learn its classification tasks?
 This is done by making small adjustments in the weights to reduce the difference between the actual and desired outputs of the perceptron. The initial weights are randomly assigned, usually in the range [−0.5, +0.5], and then updated to obtain an output consistent with the training examples.

 If at iteration p the actual output is Y(p) and the desired output is Y_d(p), then the error is given by

   e(p) = Y_d(p) − Y(p),   where p = 1, 2, 3, …

 Iteration p here refers to the p-th training example presented to the perceptron.

 If the error e(p) is positive, we need to increase the perceptron output Y(p); but if it is negative, we need to decrease Y(p).



Perceptron Learning Rule
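The rule itself appears only as a figure in the source; in its standard form it is

$$w_i(p + 1) = w_i(p) + \alpha \cdot x_i(p) \cdot e(p)$$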

where p = 1, 2, 3, …

α is the learning rate, a positive constant less than unity. The perceptron learning rule was first proposed by Rosenblatt in 1960. Using this rule we can derive a perceptron training algorithm for classification tasks.



Perceptron training algorithm
Step 1: Initialization

 Set the initial weights w1, w2, …, wn and threshold θ to random numbers in the range [−0.5, +0.5].

 If the error e(p) is positive, we need to increase the perceptron output Y(p); but if it is negative, we need to decrease Y(p).

Step 2: Activation
Activate the perceptron by applying inputs x1(p), x2(p), …, xn(p) and desired output Y_d(p).
Calculate the actual output at iteration p = 1 (the formula is given below), where n is the number of perceptron inputs and step is the step activation function.
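The output formula is omitted in the slide; its standard form is

$$Y(p) = \mathrm{step}\!\left[\sum_{i=1}^{n} x_i(p)\, w_i(p) - \theta\right]$$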
Step 3: Weight Training
Update the weights of the perceptron:

where Δw_i(p) is the weight correction at iteration p. The weight correction is computed by the delta rule; both formulas are given below.
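These formulas appear only as figures in the source; their standard forms are

$$w_i(p + 1) = w_i(p) + \Delta w_i(p), \qquad \Delta w_i(p) = \alpha \cdot x_i(p) \cdot e(p)$$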

Step 4: Iteration
Increase iteration p by 1, go back to Step 2 and repeat the process until convergence



Perceptron training algorithm (pseudocode)

inputs:
    examples, a set of examples, each with input x = x_1, …, x_n and output y
    network, a perceptron with weights W_j, j = 0, …, n, and activation function g
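The body of the pseudocode is not reproduced in the slides. A minimal runnable sketch of the training loop in Python (a sketch only, assuming the step activation and the delta rule above; the names, the learning rate of 0.1 and the AND example are our own choices):

import random

def step(x):
    return 1.0 if x >= 0 else 0.0

def train_perceptron(examples, n_inputs, alpha=0.1, epochs=100):
    # weights and threshold initialised to random values in [-0.5, +0.5]
    w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    theta = random.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for x, y_d in examples:
            y = step(sum(xi * wi for xi, wi in zip(x, w)) - theta)
            e = y_d - y                       # error at this iteration
            for i in range(n_inputs):
                w[i] += alpha * x[i] * e      # delta rule
            theta += alpha * (-1) * e         # threshold treated as a weight on a fixed input of -1
    return w, theta

# Example: learn the logical AND operation
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
print(train_perceptron(data, 2))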



Multilayer Neural Network
 A multilayer perceptron is a feedforward neural network with one or more hidden layers.

The network consists of:

• an input layer
• one or more hidden layers
• an output layer

The input signal is propagated in a forward direction on a layer-by-layer basis (a minimal forward-pass sketch follows the figure caption below).

Fig: A multilayer Perceptron with two hidden layers
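A minimal sketch of such a layer-by-layer forward pass in Python (illustrative only; the layer sizes, random weights and the sigmoid activation are our own choices):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    # layers is a list of (weight_matrix, bias_vector) pairs, one per hidden/output layer
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)   # propagate the signal layer by layer
    return a

rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 3)), np.zeros(4)),   # first hidden layer
          (rng.normal(size=(4, 4)), np.zeros(4)),   # second hidden layer
          (rng.normal(size=(2, 4)), np.zeros(2))]   # output layer
print(forward(np.array([0.5, -0.2, 0.1]), layers))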



• The hidden layer "hides" its desired output: neurons in the hidden layer cannot be observed through the input/output behaviour of the network.
• There is no obvious way to know what the desired output of the hidden layer should be.

• Commercial ANNs incorporate three and sometimes four layers, including one or two hidden layers. Each layer can contain from 10 to 1,000 neurons.
• Experimental neural networks may have five or six layers, including three or four hidden layers, and utilize millions of neurons.



Recursive Neural networks
 Recursive Neural Networks (RvNNs) are a type of neural network architecture specially designed to process hierarchical structures and capture dependencies within recursively structured data.
 Unlike traditional feedforward neural networks, RvNNs can efficiently handle tree-structured inputs, which makes them suitable for tasks involving nested and hierarchical relationships.

Working principles of RvNN

Some of the key working principles of RvNN are discussed below (a minimal sketch of the composition function follows the list):

 Recursive structure handling: RvNN is designed to handle recursive structures, which means it can naturally process hierarchical relationships in data by combining information from child nodes to form representations for parent nodes.
 Parameter sharing: RvNN often uses shared parameters across different levels of the hierarchy, which enables the model to generalize well and learn from various parts of the input structure.
 Tree traversal: RvNN traverses the tree structure in a bottom-up or top-down manner, updating node representations based on the information gathered from their children.
 Composition function: the composition function in RvNN combines information from child nodes to create a representation for the parent node. This function is crucial in capturing the hierarchical relationships within the data.
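A minimal sketch of one common choice of composition function (a single shared weight matrix applied to the concatenated children, followed by a tanh nonlinearity; the names, sizes and random values are illustrative):

import numpy as np

def compose(left, right, W, b):
    # parent representation built from the two child representations
    children = np.concatenate([left, right])
    return np.tanh(W @ children + b)

d = 4                                        # dimensionality of every node vector
rng = np.random.default_rng(0)
W, b = rng.normal(size=(d, 2 * d)), np.zeros(d)

# bottom-up pass over the small tree ((x1, x2), x3), sharing W and b at every node
x1, x2, x3 = (rng.normal(size=d) for _ in range(3))
parent = compose(compose(x1, x2, W, b), x3, W, b)
print(parent)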



Gradient Descent
 A gradient is simply a derivative: it describes how the output of a function changes in response to a small change in its inputs.
 Gradient descent was first proposed by Augustin-Louis Cauchy in the mid-19th century. It is one of the most commonly used iterative optimization algorithms in machine learning, used to train machine learning and deep learning models. It helps in finding a local minimum of a function.
 The behaviour of gradient-based updates with respect to local minima and maxima can be summarized as follows:

 If we move towards the negative gradient, i.e. away from the gradient of the function at the current point, we approach a local minimum of that function.
 If we move towards the positive gradient, i.e. towards the gradient of the function at the current point, we approach a local maximum of that function.



 Moving along the negative gradient in this way is known as gradient descent, also called the method of steepest descent; moving along the positive gradient is known as gradient ascent.
 The main objective of using a gradient descent algorithm is to minimize the cost function through iteration.
 To achieve this goal, it performs two steps iteratively (a minimal sketch follows the list):

 Calculate the first-order derivative of the function to compute the gradient, or slope, at the current point.
 Move in the direction opposite to the gradient by a step whose length is scaled by alpha, the learning rate: a tuning parameter in the optimization process that decides the length of the steps.
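A minimal sketch of these two steps in Python, minimizing f(x) = (x − 3)² (the function, starting point and learning rate are our own illustrative choices):

def grad_f(x):
    # first-order derivative of f(x) = (x - 3)^2
    return 2.0 * (x - 3.0)

x, alpha = 0.0, 0.1          # starting point and learning rate
for _ in range(100):
    x -= alpha * grad_f(x)   # step in the direction opposite to the gradient
print(x)                     # converges towards the minimizer x = 3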



• The error gradient is determined as the derivative of the activation function multiplied by the error at the neuron output.
• For neuron k in the output layer (the formula is given below):
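The formula is omitted in the slide; its standard form, and its value for the sigmoid activation used elsewhere in these notes, is

$$\delta_k(p) = \frac{\partial y_k(p)}{\partial X_k(p)} \cdot e_k(p) = y_k(p)\,[1 - y_k(p)]\, e_k(p)$$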

where
y_k(p) is the output of neuron k at iteration p, and X_k(p) is the net weighted input of
neuron k at the same iteration.



Back Propagation

 Learning in a multilayer network proceeds in the same way as for a perceptron.

 A training set of input patterns is presented to the network.
 The network computes its output pattern, and if there is an error (in other words, a difference between the actual and desired output patterns), the weights are adjusted to reduce the error.
 In a back-propagation neural network, the learning algorithm has two phases.
 First, a training input pattern is presented to the network input layer. The network propagates the input pattern from layer to layer until the output pattern is generated by the output layer.
 If this pattern is different from the desired output, an error is calculated and then propagated backwards through the network from the output layer to the input layer. The weights are modified as the error is propagated.



Fig: Three-layer back-propagation network (input signals propagate forward; error signals propagate backward)



Backpropagation algorithm
1. Build a network with the chosen number of input, hidden and output units.
2. Initialize all the weights to small random values.
3. Choose a single training pair at random.
4. Copy the input pattern to the input layer.
5. Cycle the network so that the activations from the inputs generate the activations in the hidden and output layers.
6. Calculate the error derivative between the output activation and the target output.
7. Back-propagate the summed products of the weights and errors in the output layer in order to calculate the error in the hidden units.
8. Update the weights attached to each unit according to the errors in that unit, the output from the unit below it and the learning parameter, until the error is sufficiently low or the network settles.



Back Propagation Training Algorithm

Step 1: Initialization
Set all the weights and threshold levels of the network to random
numbers uniformly distributed inside a small range:

(−2.4/F_i, +2.4/F_i)

where F_i is the total number of inputs of neuron i in the network. The weight initialization is done on a neuron-by-neuron basis.



Step 2: Activation
• Activate the back-propagation neural network by applying inputs x1(p), x2(p), …, xn(p) and desired outputs y_d,1(p), y_d,2(p), …, y_d,n(p).
• (a) Calculate the actual outputs of the neurons in the hidden layers (first formula below), where n is the number of inputs of neuron j in the hidden layer and sigmoid is the sigmoid activation function.
• (b) Calculate the actual outputs of the neurons in the output layer (second formula below), where m is the number of inputs of neuron k in the output layer.
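Both formulas are omitted in the slides; their standard forms for sigmoid activations are

$$y_j(p) = \mathrm{sigmoid}\!\left[\sum_{i=1}^{n} x_i(p)\, w_{ij}(p) - \theta_j\right], \qquad y_k(p) = \mathrm{sigmoid}\!\left[\sum_{j=1}^{m} x_{jk}(p)\, w_{jk}(p) - \theta_k\right]$$

where x_jk(p) denotes the output of hidden neuron j fed as an input to output neuron k.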




Step 3: Weight Training
• Update the weights in the back-propagation network, propagating backward the errors associated with the output neurons.
• (a) Calculate the error gradient for the neurons in the output layer.
• (b) Calculate the error gradient for the neurons in the hidden layer.
(The gradient formulas and the corresponding weight corrections are given below.)
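These formulas are omitted in the slides; their standard forms for sigmoid activations are

$$\delta_k(p) = y_k(p)\,[1 - y_k(p)]\, e_k(p), \qquad \Delta w_{jk}(p) = \alpha \cdot y_j(p) \cdot \delta_k(p)$$

$$\delta_j(p) = y_j(p)\,[1 - y_j(p)] \sum_{k=1}^{l} \delta_k(p)\, w_{jk}(p), \qquad \Delta w_{ij}(p) = \alpha \cdot x_i(p) \cdot \delta_j(p)$$

where l is the number of neurons in the output layer.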



Step 4: Iteration
 Increase iteration p by one, go back to Step 2 and repeat the process until the selected
error criterion is satisfied.
 As an example, we may consider a three-layer back-propagation network. Suppose that the network is required to perform the logical operation Exclusive-OR (XOR). Recall that a single-layer perceptron could not do this operation. Now we will apply the three-layer net.



inputs: examples, a set of examples, each with input vector x and output vector y
        network, a multilayer network with L layers, weights W_j,i, activation function g

repeat
    for each e in examples do
        for each node j in the input layer do a_j ← x_j[e]
        for l = 2 to L do
            in_i ← Σ_j W_j,i · a_j
            a_i ← g(in_i)
        for each node i in the output layer do
            Δ_i ← g′(in_i) · (y_i[e] − a_i)
        for l = L − 1 to 1 do
            for each node j in layer l do
                Δ_j ← g′(in_j) · Σ_i W_j,i · Δ_i
                for each node i in layer l + 1 do
                    W_j,i ← W_j,i + α · a_j · Δ_i
until some stopping criterion is satisfied
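A minimal runnable version of the same algorithm in Python, training a 2-2-1 network on the Exclusive-OR problem used in the worked example below (a sketch only: the sigmoid activations, random initial weights, a learning rate of 0.1 and thresholds represented as additive biases are our own assumptions where the slides do not reproduce the values):

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
# 2 inputs -> 2 hidden neurons -> 1 output neuron
w_h = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]   # hidden weights
b_h = [random.uniform(-0.5, 0.5) for _ in range(2)]                       # hidden biases
w_o = [random.uniform(-0.5, 0.5) for _ in range(2)]                       # output weights
b_o = random.uniform(-0.5, 0.5)                                           # output bias
alpha = 0.1

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]   # XOR truth table

for epoch in range(20000):
    for x, y_d in data:
        # forward pass: the input pattern is propagated layer by layer
        h = [sigmoid(sum(w_h[j][i] * x[i] for i in range(2)) + b_h[j]) for j in range(2)]
        y = sigmoid(sum(w_o[j] * h[j] for j in range(2)) + b_o)
        # backward pass: error gradients for the output and hidden neurons
        e = y_d - y
        delta_o = y * (1 - y) * e
        delta_h = [h[j] * (1 - h[j]) * delta_o * w_o[j] for j in range(2)]
        # weight training (delta rule)
        for j in range(2):
            w_o[j] += alpha * h[j] * delta_o
            for i in range(2):
                w_h[j][i] += alpha * x[i] * delta_h[j]
            b_h[j] += alpha * delta_h[j]
        b_o += alpha * delta_o

for x, y_d in data:
    h = [sigmoid(sum(w_h[j][i] * x[i] for i in range(2)) + b_h[j]) for j in range(2)]
    y = sigmoid(sum(w_o[j] * h[j] for j in range(2)) + b_o)
    print(x, round(y, 3), "target", y_d)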



Example: Three-layer network for solving the Exclusive-OR operation
• The effect of the threshold applied to a neuron in the hidden layer is represented by its weight, 𝜃, connected to a fixed
input equal to -1
• The initial weights and threshold levels are set randomly as follows:



Example (contd.): We consider a training example where the inputs x1 and x2 are equal to 1 and the desired output y_d,5 is 0. The actual outputs of neurons 3 and 4 in the hidden layer are calculated as:

Now the actual output of neuron 5 in the output layer is determined as:

Thus, the following error is obtained:



• The next step is weight training. To update the weights and threshold levels in our network, we propagate the error, e, from the output layer backward to the input layer.
• First, we calculate the error gradient for neuron 5 in the output layer:

• Then we determine the weight corrections assuming that the learning rate parameter, 𝛼, is equal to 0.1



• Now we calculate the error gradients for neurons 3 and 4 in the hidden layer:

• We then determine the weight corrections:



• At last, we update all weights and thresholds.
• The training process is repeated until the sum of squared errors is less than 0.001.



• The final results of the three-layer network learning are:



Assignment-04

1. Define neural networks. How does an ANN work?


2. How does the structure of a biological neural network influence its function?
3. What is a perceptron, and how does it function in a neural network?
4. What is gradient descent, and how is it used in training neural networks?
5. What are the main challenges associated with gradient descent in neural networks?
6. How does the learning rate affect the performance of gradient descent?
7. What is backpropagation, and why is it important in neural network training?
8. What are the common issues faced during backpropagation, and how can they be mitigated?
9. How does backpropagation help in minimizing the error in neural networks?

-Must be submitted within 7 days

Thank You !!
