Unit - 4 ANN
Topics: Gradient descent and the Delta rule, Multilayer networks, Derivation of the Backpropagation algorithm, Generalization
ANN:
A neural network is a machine learning model inspired by the neurons of the human brain. An Artificial Neural Network (ANN) is an information processing technique that works in a way analogous to how the human brain processes information: it consists of a large number of interconnected processing units that work together to process information and produce meaningful results from it.
A neural network may contain the following 3 layers:
Input layer – The activity of the input units represents the raw information that is fed into the network.
Hidden layer – The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and hidden units. There may be one or more hidden layers.
Output layer – The behavior of the output units depends on the activity of the hidden units and the weights between the hidden and output units (a small code sketch of this flow follows below).
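To make this flow of activity concrete, here is a minimal forward-pass sketch in Python. The layer sizes, random weights, and sigmoid activation are illustrative assumptions, not values from these notes:

import numpy as np

def sigmoid(z):
    # Logistic activation: squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative network: 3 input units, 2 hidden units, 1 output unit
rng = np.random.default_rng(0)
W_hidden = rng.normal(scale=0.5, size=(2, 3))  # weights: input -> hidden
W_output = rng.normal(scale=0.5, size=(1, 2))  # weights: hidden -> output

x = np.array([0.2, 0.7, 0.1])      # raw information fed to the input layer
h = sigmoid(W_hidden @ x)          # hidden activity from input activities and weights
o = sigmoid(W_output @ h)          # output activity from hidden activities and weights
print(h, o)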
ALVINN:
ALVINN (Autonomous Land Vehicle In a Neural Network) is a classic example of neural network learning: a network that learns to steer a vehicle by observing a human driver, taking a coarse camera image of the road ahead as its input and producing a steering direction as its output.
PERCEPTRONS:
A perceptron takes a vector of real-valued inputs, calculates a linear combination of these inputs, and outputs 1 if the result is greater than some threshold and -1 otherwise. More precisely, given inputs $x_1$ through $x_n$, the output is

$$o(x_1, \ldots, x_n) = \begin{cases} 1 & \text{if } w_0 + w_1 x_1 + \cdots + w_n x_n > 0 \\ -1 & \text{otherwise} \end{cases}$$

where each $w_i$ is a real-valued constant, or weight, that determines the contribution of input $x_i$ to the perceptron output.
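This definition translates directly into code. A sketch in Python (the particular weights and inputs are arbitrary illustrations):

import numpy as np

def perceptron_output(w, x):
    # o(x1..xn) = 1 if w0 + w1*x1 + ... + wn*xn > 0, else -1
    return 1 if w[0] + np.dot(w[1:], x) > 0 else -1

w = np.array([-0.3, 0.5, 0.5])                     # w[0] plays the role of the threshold
print(perceptron_output(w, np.array([1.0, 0.0])))  # -0.3 + 0.5 = 0.2 > 0  -> 1
print(perceptron_output(w, np.array([0.0, 0.0])))  # -0.3 <= 0            -> -1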
Although the perceptron rule finds a successful weight vector when the training examples are
linearly separable, it can fail to converge if the examples are not linearly separable. A second
training rule, called the delta rule, is designed to overcome this difficulty.
The key idea behind the delta rule is to use gradient descent to search the hypothesis space of
possible weight vectors to find the weights that best fit the training examples. This rule is
important because gradient descent provides the basis for the BACKPROPAGATION algorithm,
which can learn networks with many interconnected units.
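A minimal sketch of gradient-descent training with the delta rule for a single unthresholded linear unit. The data, learning rate, and epoch count below are illustrative assumptions; the weight update follows the standard batch rule, in which each weight changes by eta times the summed error-weighted inputs:

import numpy as np

def train_delta_rule(X, t, eta=0.01, epochs=500):
    # X: (m, n) inputs with a leading bias column; t: (m,) target outputs
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        o = X @ w                 # linear unit output (no threshold)
        w += eta * X.T @ (t - o)  # gradient step on the squared error
    return w

# Noisy linear data: t is roughly 1 + 2*x
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=50)
X = np.column_stack([np.ones_like(x), x])  # bias column plus the input
t = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=50)
print(train_delta_rule(X, t))              # should be approximately [1, 2]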
Single perceptrons can only express linear decision surfaces. In contrast, the kind of multilayer
networks learned by the BACKPROPAGATION algorithm are capable of expressing a rich
variety of nonlinear decision surfaces.
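A classic illustration of this gap is XOR: it is not linearly separable, so no single perceptron can represent it, but a small two-layer network of threshold units can. The hand-chosen weights below are one standard choice, shown purely for illustration:

import numpy as np

def step(z):
    # Threshold unit: 1 if the net input is positive, else 0
    return (z > 0).astype(int)

def xor_net(x):
    # Hidden layer: unit 1 computes OR(x1, x2), unit 2 computes AND(x1, x2)
    h = step(np.array([x[0] + x[1] - 0.5,
                       x[0] + x[1] - 1.5]))
    # Output: OR and not AND, i.e. XOR -- a nonlinear decision surface
    return step(np.array([h[0] - h[1] - 0.5]))[0]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(np.array([a, b])))  # prints the XOR truth table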
For example, a typical multilayer network and decision surface is depicted in Figure 4.5. Here
the speech recognition task involves distinguishing among 10 possible vowels, all spoken in the
context of "h-d" (i.e., "hid," "had," "head," "hood," etc.). The input speech signal is represented
by two numerical parameters obtained from a spectral analysis of the sound, allowing us to
easily visualize the decision surface over the two-dimensional instance space. As shown
in the figure, it is possible for the multilayer network to represent highly nonlinear
decision surfaces that are much more expressive than the linear decision surfaces
of single units.
The BACKPROPAGATION Algorithm:
The BACKPROPAGATION algorithm learns the weights for a multilayer network, given a
network with a fixed set of units and interconnections. It employs gradient descent to attempt
to minimize the squared error between the network output values and the target values for
these outputs.
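A compact sketch of this procedure for a network with one hidden layer of sigmoid units, trained by stochastic gradient descent. The layer sizes, learning rate, epoch count, and the XOR demonstration task are illustrative assumptions; the weight updates follow the standard error terms for sigmoid networks:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, n_hidden=3, eta=0.5, epochs=10000, seed=0):
    rng = np.random.default_rng(seed)
    # Small random initial weights; the extra column/entry holds a bias weight
    W1 = rng.uniform(-0.5, 0.5, size=(n_hidden, X.shape[1] + 1))
    W2 = rng.uniform(-0.5, 0.5, size=(T.shape[1], n_hidden + 1))
    for _ in range(epochs):
        for x, t in zip(X, T):                 # one training example at a time
            xb = np.append(x, 1.0)             # constant input for the bias weight
            h = sigmoid(W1 @ xb)               # forward pass: input -> hidden
            hb = np.append(h, 1.0)
            o = sigmoid(W2 @ hb)               # forward pass: hidden -> output
            delta_o = o * (1 - o) * (t - o)              # output error terms
            delta_h = h * (1 - h) * (W2[:, :-1].T @ delta_o)  # propagated back
            W2 += eta * np.outer(delta_o, hb)  # gradient steps on squared error
            W1 += eta * np.outer(delta_h, xb)
    return W1, W2

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train_backprop(X, T)
for x, t in zip(X, T):
    h = sigmoid(W1 @ np.append(x, 1.0))
    print(x, t, sigmoid(W2 @ np.append(h, 1.0)))  # outputs should approach the targets

A run may occasionally stall far from the targets: gradient descent can get trapped, which is the subject of the Convergence and Local Minima section below.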
Derivation of the Backpropagation Algorithm:
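In outline: for a feedforward network of sigmoid units with squared error $E = \frac{1}{2}\sum_k (t_k - o_k)^2$, differentiating $E$ with respect to each weight yields the following update rules (this is a summary of the standard result, not the full derivation; $\eta$ is the learning rate and $x_{ji}$ is the input from unit $i$ into unit $j$):

$$\delta_k = o_k (1 - o_k)(t_k - o_k) \qquad \text{for each output unit } k$$
$$\delta_h = o_h (1 - o_h) \sum_{k \in outputs} w_{kh}\, \delta_k \qquad \text{for each hidden unit } h$$
$$\Delta w_{ji} = \eta\, \delta_j\, x_{ji} \qquad \text{for every network weight } w_{ji}$$

The factor $o(1 - o)$ is the derivative of the sigmoid function, which is why it appears in both error terms.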
Convergence and Local Minima:
Because the error surface of a multilayer network may contain many local minima, gradient descent is guaranteed to converge only toward some local minimum of E, not necessarily the global minimum error. In practice, BACKPROPAGATION nevertheless performs well on many tasks; common heuristics for alleviating local minima include adding a momentum term to the weight update, using stochastic (incremental) rather than true gradient descent, and training multiple networks from different random initial weights.
Generalization:
The performance of an Artificial Neural Network (ANN) depends largely on its generalization capability, that is, its ability to handle unseen data. The generalization capability of the network is mostly determined by the complexity of the system and by how the network is trained.
The termination condition for the BACKPROPAGATION algorithm has been left unspecified. What is an appropriate condition for terminating the weight-update loop? One obvious choice is to continue training until the error E on the training examples falls below some predetermined threshold. In fact, this is a poor strategy, because BACKPROPAGATION is susceptible to overfitting the training examples at the cost of decreasing generalization accuracy over other unseen examples.
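One widely used remedy is to hold out a separate validation set and stop training once error on it stops improving. A sketch of this early-stopping strategy in Python; make_net, net.train_epoch, net.error, and net.weights are hypothetical placeholders for whatever network implementation is in use:

import numpy as np

def train_with_early_stopping(train_data, val_data, make_net,
                              max_epochs=10000, patience=20):
    # make_net() is assumed to return a network object with train_epoch,
    # error, and weights -- hypothetical names, not a real library API.
    net = make_net()
    best_error, best_weights, bad_epochs = np.inf, net.weights.copy(), 0
    for epoch in range(max_epochs):
        net.train_epoch(train_data)        # one pass of weight updates
        val_error = net.error(val_data)    # error on examples never trained on
        if val_error < best_error:
            best_error, best_weights, bad_epochs = val_error, net.weights.copy(), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:     # validation error stopped improving
                break
    net.weights = best_weights             # keep the best-generalizing weights
    return net

The design choice here is that training error alone never decides termination: the held-out set stands in for the unseen examples on which generalization accuracy is ultimately measured.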