Learning in A Feed Forward Multiple Layer ANN - Backpropagation
L8 Convolutional Neural Networks (CNN)
L9 Deep Learning and recent developments in the ANN field
L10 Tutorial on assignments
Learning in a Feed Forward Multi Layer ANN: Backpropagation
This lecture focuses on the feed forward multi layer network case, with a learning mechanism called Backpropagation.
Backpropagation can be viewed as a special case of automatic differentiation (AD). AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations (addition, subtraction, multiplication, division, etc.) and elementary functions (exp, log, sin, cos, etc.). By applying the chain rule repeatedly to these operations, derivatives of arbitrary order can be computed automatically, accurately and efficiently.
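As a minimal illustration of this idea, the sketch below implements forward-mode AD with dual numbers: each value carries its derivative, and every elementary operation applies the chain rule. (Backpropagation corresponds to reverse-mode AD; forward mode is simply the shortest way to show the chain-rule mechanism. The Dual class and dsin helper are illustrative names, not from the lecture.)

```python
import math

class Dual:
    """A value paired with its derivative (a dual number)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        # sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # product rule: (u * v)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def dsin(x):
    # chain rule for an elementary function: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# d/dx [ x * sin(x) ] at x = 2, seeding dx/dx = 1
x = Dual(2.0, 1.0)
y = x * dsin(x)
print(y.val, y.dot)   # derivative = sin(2) + 2*cos(2) = 0.0770...
```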
The combined Feed Forward and Backpropagation mechanisms
0.1. Specify the structure of the multiple layer ANN, including inputs and outputs.
0.2. Specify the biases (modelled as extra inputs) and activation functions for all neurons.
0.3. Specify the learning rate.
1. Feed forward the input values across the network, using the single neuron model for each neuron to calculate the output values in all layers:
   Yj = f( Sum_{i=0..n} wij * xi ) if Sum_{i=0..n} wij * xi > 0, otherwise Yj = 0.
2. Calculate the error sensitivity at the output node: dj = (Tj - Yj) * df(e)/de.
3. Backpropagate the errors across the network, using the same weights as in the feed forward phase: di = Sum_j (wij * dj), where j ranges over all units that receive input from unit i.
4. Update the weights using the backpropagation weight updating formula: wij = wij + a * dj * df(e)/de * xi, where a is the learning rate.
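A minimal sketch of how the four steps fit together, here on a tiny 2-2-1 network with a sigmoid activation (the network size, the sigmoid choice, the single training pattern, and the omission of step 1's positivity gate are illustrative assumptions, not the lecture's example):

```python
import math, random

random.seed(0)

def f(e):                 # activation function
    return 1.0 / (1.0 + math.exp(-e))

def df(e):                # its derivative df(e)/de
    return f(e) * (1.0 - f(e))

# Steps 0.1-0.3: structure 2 inputs -> 2 hidden -> 1 output, learning rate a
a = 0.5
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
W2 = [random.uniform(-1, 1) for _ in range(2)]

def train_step(x, T):
    # Step 1: feed forward, ej = Sum_i wij * xi, Yj = f(ej)
    e1 = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    y1 = [f(e) for e in e1]
    e2 = sum(w * y for w, y in zip(W2, y1))
    Y = f(e2)
    # Step 2: error sensitivity at the output, dj = (Tj - Yj) * df(e)/de
    d2 = (T - Y) * df(e2)
    # Step 3: backpropagate with the same weights, di = Sum_j wij * dj
    d1 = [W2[j] * d2 for j in range(2)]
    # Step 4: update the weights, wij = wij + a * dj * df(ej)/de * xi
    # (for W2 the df factor is already contained in d2)
    for j in range(2):
        for i in range(2):
            W1[j][i] += a * d1[j] * df(e1[j]) * x[i]
        W2[j] += a * d2 * y1[j]
    return Y

for _ in range(2000):
    train_step([1.0, 0.0], 1.0)          # one input pattern, target T = 1
print(train_step([1.0, 0.0], 1.0))       # output has moved close to 1.0
```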
DISCLAIMER: In the example, the neuron functionality is very simplified and partly unrealistic in the following senses:
- The activation function is the identity function, which means the gradients trivially become 1
- The learning rate = 1
- The biases are disregarded to simplify the graphs (but could easily be included, as shown in the single neuron lecture)
This neural network has two inputs, one output and six units in three layers: units 1, 2 and 3 in the first layer, units 4 and 5 in the second, and output unit 6.
Feed forward from input to output layer
Error estimation
The output error is T - Y, where T is the target value and Y the actual output (in the example figures the target is denoted Z).
Backpropagation of errors
The error sensitivity d is propagated back from the output layer to all neurons in the preceding layer whose output signals were input to the considered output neuron. The errors of the preceding layer neurons are calculated by applying exactly the same weights as were used in the feed forward phase.
The same procedure is applied at all boundaries between layers, back to the input layer.
Re-calculation of weights
When the error signal for each neuron has been computed, the weight coefficients of each neuron's input connections can be modified.
Example: Feed forward
(bias = 0, all weights = 0.5, identity activation, so y = Sum_i xi * wi at every unit)
- Inputs: X1 = 1, X2 = 2
- First layer (units 1, 2, 3): y = 0.5*1 + 0.5*2 = 1.5
- Second layer (units 4, 5): y = 0.5*1.5 + 0.5*1.5 + 0.5*1.5 = 2.25
- Output (unit 6): Y = 0.5*2.25 + 0.5*2.25 = 2.25; target Z = 3, error D = Z - Y = 0.75
Example: Backpropagation of error
(bias = 0, learning rate = 1, activation function = identity)
- Output unit 6: Y = 2.25, target Z = 3, error sensitivity D = 0.75
- Second layer: D4 = 0.5*0.75 = 0.375 and D5 = 0.5*0.75 = 0.375
- First layer: D1 = 0.5*0.375 + 0.5*0.375 = 0.375, and likewise D2 = D3 = 0.375
Example: Re-calculation of weights
(bias = 0, learning rate = 1, activation function = identity, so w' = w + 1 * d * x for each connection)
- From X1 = 1 into units 1, 2, 3: w' = 0.5 + 1*0.375 = 0.875
- From X2 = 2 into units 1, 2, 3: w' = 0.5 + 2*0.375 = 1.25
- From the first-layer outputs (y = 1.5) into units 4 and 5: w' = 0.5 + 1.5*0.375 = 1.0625
- From units 4 and 5 (y = 2.25) into unit 6: w' = ? (exercise)
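The short sketch below reproduces the three example figures end to end. The list-based layout and variable names are my own choices; the weights, learning rate, identity activation and numbers come from the figures.

```python
# Worked example: 2 inputs -> 3 units -> 2 units -> 1 output,
# all weights 0.5, bias 0, learning rate 1, identity activation.
X = [1.0, 2.0]                               # inputs X1, X2
W1 = [[0.5, 0.5] for _ in range(3)]          # weights into units 1, 2, 3
W2 = [[0.5, 0.5, 0.5] for _ in range(2)]     # weights into units 4, 5
W3 = [0.5, 0.5]                              # weights into unit 6
Z, lr = 3.0, 1.0                             # target and learning rate

def layer(inp, weights):
    # identity activation: y_j = Sum_i w_ij * x_i
    return [sum(w * x for w, x in zip(row, inp)) for row in weights]

# Feed forward
y1 = layer(X, W1)                            # [1.5, 1.5, 1.5]
y2 = layer(y1, W2)                           # [2.25, 2.25]
Y = sum(w * y for w, y in zip(W3, y2))       # 2.25

# Backpropagation of error (identity => df(e)/de = 1)
d6 = Z - Y                                   # 0.75
d45 = [W3[j] * d6 for j in range(2)]         # [0.375, 0.375]
d123 = [sum(W2[j][i] * d45[j] for j in range(2))   # [0.375, 0.375, 0.375]
        for i in range(3)]

# Re-calculation of weights: w' = w + lr * d * x
W1 = [[w + lr * d * x for w, x in zip(row, X)] for row, d in zip(W1, d123)]
W2 = [[w + lr * d * y for w, y in zip(row, y1)] for row, d in zip(W2, d45)]
W3 = [w + lr * d6 * y for w, y in zip(W3, y2)]

print(W1)   # [[0.875, 1.25], [0.875, 1.25], [0.875, 1.25]]
print(W2)   # [[1.0625, 1.0625, 1.0625], [1.0625, 1.0625, 1.0625]]
print(W3)   # the "w' = ?" weights: 0.5 + 1 * 0.75 * 2.25 = 2.1875
```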
An Analogy to Backpropagation in ANN
Bucket Brigade Algorithm
for Genetic Algorithms/Classifier systems
A classifier system is a rule-based system architecture in which the adaptation of the rule base is driven by a genetic algorithm.
In genetic algorithm terminology, a rule is treated as a gene in the gene pool and is thereby automatically provided with a fitness value. The fitness of a rule decides its destiny in the genetic selection process.
Many rules contribute to each solution (typically in a hierarchical fashion). The Bucket Brigade Algorithm allocates credit and blame backwards through the rule application hierarchy, and the reward allotted to a particular rule forms the basis for updating its fitness value.
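To make the analogy concrete, here is a heavily simplified sketch of bucket-brigade-style credit flow (the fixed rule chain, the bid_fraction parameter, and the strength update scheme are illustrative assumptions, not the full classifier-system algorithm): each firing rule pays part of its strength to the rule that set it up, and the final rule collects the external reward, so over repeated episodes the reward percolates backwards along the chain, much as error sensitivities percolate backwards through the layers above.

```python
# Simplified bucket-brigade credit flow along a fixed chain of rules.
# Each firing rule pays a bid to its predecessor; the last rule in the
# chain receives the external reward. Over repeated episodes the reward
# percolates backwards, raising the strength of early "stage-setting" rules.
def episode(strengths, reward, bid_fraction=0.1):
    bids = [s * bid_fraction for s in strengths]
    for i in range(len(strengths)):
        strengths[i] -= bids[i]          # each rule pays its bid...
        if i > 0:
            strengths[i - 1] += bids[i]  # ...to the rule that set it up
    strengths[-1] += reward              # external reward to the final rule
    return strengths

rules = [10.0, 10.0, 10.0]               # strengths of a 3-rule chain
for _ in range(50):
    episode(rules, reward=5.0)
print(rules)                              # early rules' strengths have grown
```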
Specific Feed Forward approaches
Radial Basis Function Network