Unit 1 Notes
UNIT - I
Network Topology
Adjustments of Weights or Learning
Activation Functions
In this chapter, we will discuss these three building blocks of ANN in detail.
Network Topology
A network topology is the arrangement of a network along with its nodes and
connecting lines. According to the topology, ANN can be classified as the following
kinds −
Feedforward Network
Feedback Network
As the name suggests, a feedback network has feedback paths, which means
the signal can flow in both directions using loops. This makes it a non-linear
dynamic system, which changes continuously until it reaches a state of
equilibrium. It may be divided into the following types −
Supervised Learning
As the name suggests, this type of learning is done under the supervision of
a teacher; the learning process depends on that supervision.
During the training of an ANN under supervised learning, the input vector is
presented to the network, which produces an output vector. This output vector is
compared with the desired (target) output vector, and an error signal is generated
if the two differ. On the basis of this error signal, the weights are adjusted until
the actual output matches the desired output, as in the sketch below.
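A minimal sketch of this error-driven adjustment in Python (the single linear unit, the toy data, and the learning rate are illustrative assumptions, not from the notes):

def supervised_step(weights, x, target, learning_rate=0.1):
    # Present the input vector and compute the actual output.
    output = sum(w * xi for w, xi in zip(weights, x))
    # Error signal: difference between the desired and the actual output.
    error = target - output
    # Adjust each weight on the basis of the error signal.
    return [w + learning_rate * error * xi for w, xi in zip(weights, x)]

weights = [0.0, 0.0]
for _ in range(50):
    weights = supervised_step(weights, x=[1.0, 2.0], target=1.0)
print(weights)  # the output now closely matches the target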
Unsupervised Learning
As the name suggests, this type of learning is done without the supervision
of a teacher; the learning process is independent of one.
During the training of ANN under unsupervised learning, the input vectors
of similar type are combined to form clusters. When a new input pattern is applied,
then the neural network gives an output response indicating the class to which the
input pattern belongs.
There is no feedback from the environment as to what the desired output should
be or whether it is correct or incorrect. Hence, in this type of learning, the network
itself must discover the patterns and features in the input data, and the relation
of the input data to the output.
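A hedged sketch of this idea in Python, using nearest-centroid assignment as a stand-in for the clustering behaviour (the cluster centres and the input pattern are illustrative values):

def nearest_cluster(pattern, centroids):
    # Return the index of the cluster centre closest to the input pattern.
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(centroids)), key=lambda k: sq_dist(pattern, centroids[k]))

# Cluster centres formed from inputs of similar type (illustrative values)
centroids = [[0.0, 0.0], [5.0, 5.0]]
print(nearest_cluster([4.5, 5.2], centroids))  # 1, the class of the new pattern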
Reinforcement Learning
Activation Functions
An activation function may be defined as the extra force or effort applied over the
input to obtain an exact output. In an ANN, activation functions are applied over
the net input to produce the output. The following are some activation functions
of interest −
Linear (identity) activation function − it performs no change on the input: F(x) = x
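As a small illustration (the function names are my own), the identity and binary sigmoid activations can be written in Python as:

import numpy as np

def identity(x):
    # Linear (identity) activation: the input passes through unchanged, F(x) = x.
    return x

def binary_sigmoid(x):
    # Binary sigmoid squashes any real input into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

print(identity(0.5))        # 0.5
print(binary_sigmoid(0.0))  # 0.5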
Imagine you have been dropped somewhere on a mountainous island in thick fog.
You have to go down, but you can hardly see anything, maybe just a few meters.
Your task is to find your way down, but you cannot see the path. You can use
the method of gradient descent. This means that you are examining the
steepness at your current position. You will proceed in the direction with the
steepest descent.
You take only a few steps and then you stop again to reorient yourself. This
means you apply the previously described procedure again, i.e. you look for the
direction of steepest descent.
Keeping going like this will enable you to arrive at a position where there is no
further descent, i.e. every direction leads upwards. You may have reached the
deepest level (the global minimum), but you could equally be stuck in a basin (a
local minimum). If you start at the position on the right side of our image,
everything works out fine, but from the left side you will be stuck in a local
minimum. In summary, if you are dropped many times at random places on this
theoretical island, you will find ways down to sea level. This is essentially what
we do when we train a neural network.
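A minimal sketch of this procedure in Python (the one-dimensional landscape f(x) = x**2, its gradient 2x, and the step size are illustrative assumptions, not from the notes):

def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    # Examine the steepness (gradient) at the current position and
    # repeatedly take a small step in the direction of steepest descent.
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# f(x) = x**2 has gradient 2x and a single (global) minimum at x = 0.
print(gradient_descent(lambda x: 2 * x, x0=5.0))  # close to 0.0

On a landscape with several basins, the result depends on the starting position, exactly as with the left-side and right-side starts described above.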
We have labels, i.e. a target or desired value t for each output value o.
The error is the difference between the target and the actual output:

e = t − o

We will later use a squared error function, E = ½ (t − o)², because it has better
characteristics for the algorithm.

We will have a look at the output value o1, which depends on the weights
w11, w21, w31 and w41. Let us assume the calculated value o1 is 0.92 and the
desired value t1 is 1. In this case the error is

e1 = t1 − o1 = 1 − 0.92 = 0.08

This error is apportioned among the weights between the hidden and the output
layer in proportion to their size: weight wi1 receives the share
(wi1 / (w11 + w21 + w31 + w41)) · e1. The denominator is the same for every
weight feeding o1; it is just a scaling factor.
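A small numeric check of this example in Python (the four weight values are made-up placeholders):

t1, o1 = 1.0, 0.92
e1 = t1 - o1                  # error: 0.08
E = 0.5 * e1 ** 2             # squared error used later by the algorithm

# Distribute e1 to the weights feeding o1 in proportion to their size;
# the common denominator is the scaling factor mentioned above.
weights = [0.6, 0.1, 0.15, 0.25]   # placeholder values for w11, w21, w31, w41
total = sum(weights)
shares = [w / total * e1 for w in weights]
print(e1, E, shares)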
The history of ANN includes the following milestones −
1943 − It has been assumed that the concept of neural network started with
the work of physiologist, Warren McCulloch, and mathematician, Walter
Pitts, when in 1943 they modeled a simple neural network using electrical
circuits in order to describe how neurons in the brain might work.
1949 − Donald Hebb’s book, The Organization of Behavior, put forth the fact
that repeated activation of one neuron by another increases its strength
each time they are used.
1956 − An associative memory network was introduced by Taylor.
1958 − A learning method for McCulloch and Pitts neuron model named
Perceptron was invented by Rosenblatt.
The historical review shows that significant progress has been made in this field.
Neural network based chips are emerging and applications to complex problems
are being developed. Surely, today is a period of transition for neural network
technology.
Biological Neuron
Schematic Diagram
Working
As shown in the above diagram, a typical neuron consists of the following four
parts, with the help of which we can explain its working. Each part has a
counterpart in an ANN −

Biological Neuron (BNN)    Artificial Neuron (ANN)
Soma                       Node
Dendrites                  Input
Synapse                    Weights or interconnections
Axon                       Output
The following table shows the comparison between ANN and BNN based on some
criteria mentioned.
Criteria          BNN                                              ANN
Size              10^11 neurons and 10^15 interconnections         10^2 to 10^4 nodes (mainly depends on the type of application and the network designer)
Fault tolerance   Performance degrades with even partial damage    It is capable of robust performance, hence has the potential to be fault tolerant
The following diagram represents the general model of ANN followed by its
processing.
Supervised Learning
Perceptron
Training Algorithm
A perceptron network can be trained for a single output unit as well as for
multiple output units.
Step 1 − Initialize the following to start the training −
Weights
Bias
Learning rate α
For easy calculation and simplicity, set the weights and bias to 0 and the
learning rate to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Case 1 − if y ≠ t then,
wi(new)=wi(old)+αtxi
b(new)=b(old)+αt
Case 2 − if y = t then,
wi(new)=wi(old)
b(new)=b(old)
Here ‘y’ is the actual output and ‘t’ is the desired/target output.
Step 8 − Test for the stopping condition, which occurs when there is no change
in the weights.
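Putting steps 1-8 together, a minimal single-output perceptron trainer might look like this in Python (the bipolar threshold output and the toy AND data are assumptions for illustration):

def perceptron_train(samples, learning_rate=1.0):
    # Step 1: weights and bias start at 0; the learning rate defaults to 1.
    w, b = [0.0] * len(samples[0][0]), 0.0
    changed = True
    while changed:                     # Step 2: repeat while weights still change
        changed = False
        for x, t in samples:           # each training pair s:t
            net = b + sum(wi * xi for wi, xi in zip(w, x))
            y = 1 if net > 0 else -1   # bipolar threshold output (assumed)
            if y != t:                 # Case 1: adjust on error
                w = [wi + learning_rate * t * xi for wi, xi in zip(w, x)]
                b += learning_rate * t
                changed = True
            # Case 2 (y == t): weights and bias stay unchanged
    return w, b                        # Step 8: no change in weights, so stop

# Bipolar AND function as toy training data
samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
print(perceptron_train(samples))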
The following diagram is the architecture of perceptron for multiple output classes.
Step 1 − Initialize the following to start the training −
Weights
Bias
Learning rate α
For easy calculation and simplicity, set the weights and bias to 0 and the
learning rate to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Case 1 − if yj ≠ tj then,
wij(new)=wij(old)+αtjxi
bj(new)=bj(old)+αtj
Case 2 − if yj = tj then,
wij(new)=wij(old)
bj(new)=bj(old)
Here yj is the actual output and tj is the desired/target output of the j-th
output unit.
Step 8 − Test for the stopping condition, which occurs when there is no change
in the weights.
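For multiple output units the same two cases are applied per unit j, as in this hedged NumPy sketch (the array shapes, epoch limit, and toy data are assumptions):

import numpy as np

def perceptron_train_multi(X, T, learning_rate=1.0, max_epochs=100):
    # Step 1: the weight matrix w[i][j] and bias vector b[j] start at 0.
    W = np.zeros((X.shape[1], T.shape[1]))
    b = np.zeros(T.shape[1])
    for _ in range(max_epochs):                  # Step 2: loop until stable
        changed = False
        for x, t in zip(X, T):
            y = np.where(x @ W + b > 0, 1, -1)   # bipolar output per unit (assumed)
            for j in range(T.shape[1]):
                if y[j] != t[j]:                 # Case 1: adjust output unit j
                    W[:, j] += learning_rate * t[j] * x
                    b[j] += learning_rate * t[j]
                    changed = True
                # Case 2 (y[j] == t[j]): no change for unit j
        if not changed:                          # Step 8: weights are stable
            break
    return W, b

X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
T = np.array([[1], [-1], [-1], [-1]], dtype=float)  # one output column as a toy case
print(perceptron_train_multi(X, T))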
Architecture
Training Algorithm
Weights
Bias
Learning rate α
For easy calculation and simplicity, set the weights and bias to 0 and the
learning rate to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Step 3 − Continue steps 4-6 for every bipolar training pair s:t.
Step 8 − Test for the stopping condition, which occurs when there is no change
in the weights, or when the highest weight change recorded during training is
smaller than the specified tolerance.
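A minimal Adaline (delta-rule) sketch in Python that uses this tolerance-based stopping test (the toy data, learning rate, and tolerance value are assumptions):

def adaline_train(samples, learning_rate=0.1, tolerance=1e-4):
    # Step 1: start with zero weights and bias.
    w, b = [0.0] * len(samples[0][0]), 0.0
    while True:
        largest_change = 0.0
        for x, t in samples:                    # each bipolar training pair s:t
            net = b + sum(wi * xi for wi, xi in zip(w, x))
            delta = learning_rate * (t - net)   # delta rule uses the net input
            w = [wi + delta * xi for wi, xi in zip(w, x)]
            b += delta
            largest_change = max(largest_change, abs(delta))
        if largest_change < tolerance:          # Step 8: stop when the highest
            return w, b                         # weight change is below tolerance

samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
print(adaline_train(samples))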
Madaline, which stands for Multiple Adaptive Linear Neuron, is a network that
consists of many Adalines in parallel. It has a single output unit. Some
important points about Madaline are as follows −
It is just like a multilayer perceptron, where Adaline will act as a hidden unit
between the input and the Madaline layer.
The weights and the bias between the input and Adaline layers, as we see in
the Adaline architecture, are adjustable.
The Adaline and Madaline layers have fixed weights and bias of 1.
Training can be done with the help of Delta rule.
Architecture
Training Algorithm
By now we know that only the weights and bias between the input and the Adaline
layer are to be adjusted, and the weights and bias between the Adaline and the
Madaline layer are fixed.
Weights
Bias
Learning rate α
For easy calculation and simplicity, set the weights and bias to 0 and the
learning rate to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Step 3 − Continue steps 4-7 for every bipolar training pair s:t.
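The elided steps cover the forward pass and the weight updates; here is a hedged sketch of the Madaline forward pass only (the hidden-unit weights below are hand-picked to realise bipolar XOR, purely for illustration, not learned values):

def madaline_forward(x, hidden_weights, hidden_biases):
    # Hidden layer: each Adaline has its own adjustable weights and bias.
    def sign(net):
        return 1 if net >= 0 else -1
    hidden_out = [sign(b + sum(wi * xi for wi, xi in zip(w, x)))
                  for w, b in zip(hidden_weights, hidden_biases)]
    # Output layer: fixed weights and bias of 1, as stated above; with two
    # hidden Adalines this behaves like a logical OR of their outputs.
    return sign(1 + sum(hidden_out))

hidden_weights = [[1.0, -1.0], [-1.0, 1.0]]   # hand-picked, not learned
hidden_biases = [-0.5, -0.5]
for x in ([1, 1], [1, -1], [-1, 1], [-1, -1]):
    print(x, madaline_forward(x, hidden_weights, hidden_biases))  # bipolar XOR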
Architecture
The hidden layer as well as the output layer has a bias, whose weight is always
1, on them. As is clear from the diagram, the working of BPN is in two phases:
one phase sends the signal from the input layer to the output layer, and the
other phase back-propagates the error from the output layer to the input layer.
Training Algorithm
For training, BPN uses the binary sigmoid activation function. The training of
BPN has the following three phases.
Weights
Learning rate α
For easy calculation and simplicity, take some small random values.
Step 2 − Continue steps 3-11 while the stopping condition is not true.
Phase 1
Step 4 − Each input unit receives the input signal xi and sends it to the hidden
units, for all i = 1 to n.
Step 5 − Calculate the net input at the hidden unit using the following relation −
Qinj = b0j + Σ (i = 1 to n) xi vij,   j = 1 to p
where b0j is the bias on hidden unit j, p is the number of hidden units, and vij
is the weight from input unit i to hidden unit j.
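A sketch of this forward step in Python (the variable names follow the relation above; all numeric values are illustrative):

import numpy as np

x = np.array([0.0, 1.0])                   # input signals xi
v = np.array([[0.6, -0.1],
              [-0.3, 0.4]])                # weights vij from input i to hidden j
b0 = np.array([0.3, 0.5])                  # biases b0j on the hidden units

q_in = b0 + x @ v                          # Step 5: net input Qinj at each hidden unit
z = 1.0 / (1.0 + np.exp(-q_in))            # binary sigmoid activation, as stated above
print(q_in, z)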
Phase 2