ML Session 15 Backpropagation
SESSION 15
AIM
To familiarise students with the basic concepts of feedforward neural networks and their learning algorithms, and with the application of neural networks to various problems.
INSTRUCTIONAL OBJECTIVES
LEARNING OUTCOMES
• Travel back from the output layer to the hidden layer to adjust the weights such
that the error is decreased.
• Keep repeating the process until the desired output is achieved.
A Step by Step BACK PROPAGATION Example
The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary
inputs to outputs.
We will work through a single training example: given inputs 0.05 and 0.10, we want the neural network to output 0.01 and 0.99.
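As a minimal sketch of this setup, the forward pass below assumes a 2-2-2 network (two inputs, two hidden neurons, two outputs) with logistic activations; the initial weight and bias values are not stated in the text above and are taken from the worked example in reference [2].

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Inputs and targets from the example above
i1, i2 = 0.05, 0.10
t1, t2 = 0.01, 0.99

# Assumed initial weights and biases (from the worked example in [2])
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input  -> hidden
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # hidden -> output
b1, b2 = 0.35, 0.60                       # hidden bias, output bias

# Forward pass: hidden layer
out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)   # ~0.59327
out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)   # ~0.59688

# Forward pass: output layer
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)   # ~0.75137
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)   # ~0.77293
```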
BACK PROPAGATION WORKING METHOD
For example, the target output for o1 is 0.01 but the neural network outputs 0.75136507, therefore its error is:

E_o1 = ½(target_o1 − out_o1)² = ½(0.01 − 0.75136507)² = 0.274811083

Repeating this process for o2 (remembering that the target is 0.99) we get E_o2. The total error for the neural network is the sum of these errors:

E_total = E_o1 + E_o2
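Continuing the sketch above (output values carried over as literals so the snippet runs on its own), the errors follow directly from the squared-error formula:

```python
# Output values carried over from the forward-pass sketch above
out_o1, out_o2 = 0.75136507, 0.77292847
t1, t2 = 0.01, 0.99

E_o1 = 0.5 * (t1 - out_o1) ** 2   # ~0.274811
E_o2 = 0.5 * (t2 - out_o2) ** 2   # ~0.023560
E_total = E_o1 + E_o2             # ~0.298371
```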
The Backwards Pass
Our goal with backpropagation is to update each of the weights in the network so that they cause the actual output to be
closer to the target output, thereby minimizing the error for each output neuron and the network as a whole.
Output Layer
Consider the weight w5 connecting hidden neuron h1 to output neuron o1. We want to know how much a change in w5 affects the total error, i.e. ∂E_total/∂w5. By the chain rule,

∂E_total/∂w5 = ∂E_total/∂out_o1 × ∂out_o1/∂net_o1 × ∂net_o1/∂w5

First, how much does the total error change with respect to the output?

∂E_total/∂out_o1 = −(target_o1 − out_o1) = −(0.01 − 0.75136507) = 0.74136507

When we take the partial derivative of the total error with respect to out_o1, the quantity E_o2 becomes zero because out_o1 does not affect it, which means we are taking the derivative of a constant, which is zero.
Next, how much does the output of o1 change with respect to its total net input? The partial derivative of the logistic function is the output multiplied by one minus the output:

∂out_o1/∂net_o1 = out_o1 × (1 − out_o1) = 0.75136507 × (1 − 0.75136507) = 0.186815602

Finally, how much does the total net input of o1 change with respect to w5? Since net_o1 = w5 × out_h1 + w6 × out_h2 + b2, the derivative is simply the hidden output that w5 multiplies:

∂net_o1/∂w5 = out_h1

Putting it all together:

∂E_total/∂w5 = ∂E_total/∂out_o1 × ∂out_o1/∂net_o1 × ∂net_o1/∂w5

You'll often see this calculation combined in the form of the delta rule:

∂E_total/∂w5 = −(target_o1 − out_o1) × out_o1(1 − out_o1) × out_h1 = δ_o1 × out_h1
The Delta rule in machine learning and neural network environments is a specific type of backpropagation that
helps to refine connectionist ML/AI networks, making connections between inputs and outputs with layers of
artificial neurons.
In general, backpropagation recalculates the weights of artificial neurons using a gradient method. Delta learning does this using the difference between a target activation and the actual obtained activation; the network connections are then adjusted in proportion to this difference, originally for units with a linear activation function.
Another way to explain the Delta rule is that it uses an error function to perform gradient descent learning.
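Continuing the worked sketch, the output-layer gradient for w5 can be computed either as the explicit chain-rule product or in the equivalent delta-rule form (values carried over as literals so the snippet runs on its own):

```python
# Values carried over from the forward-pass sketch above
out_h1, out_o1 = 0.59326999, 0.75136507
t1 = 0.01

# Chain-rule factors for w5 (the weight from hidden neuron h1 to output o1)
dE_dout_o1 = -(t1 - out_o1)               # ~0.741365
dout_o1_dnet_o1 = out_o1 * (1 - out_o1)   # ~0.186816 (logistic derivative)
dnet_o1_dw5 = out_h1                      # ~0.593270

dE_dw5 = dE_dout_o1 * dout_o1_dnet_o1 * dnet_o1_dw5   # ~0.082167

# The same gradient written in delta-rule form
delta_o1 = dE_dout_o1 * dout_o1_dnet_o1
print(dE_dw5, delta_o1 * out_h1)          # both ~0.082167
```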
Apply the Delta rule
To decrease the error, we then subtract this value from the current weight (optionally multiplied by some learning rate, eta, which we'll set to 0.5):

w5_new = w5 − η × ∂E_total/∂w5
We can repeat this process to get the new values of the other hidden-to-output weights, w6, w7, and w8.
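A sketch of the full update for the hidden-to-output weights, with the learning rate of 0.5 mentioned above; the gradients for w6, w7 and w8 follow the same pattern as w5 (initial weights and hidden outputs carried over from the earlier sketches):

```python
# Values carried over from the earlier sketches
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
out_h1, out_h2 = 0.59326999, 0.59688438
out_o1, out_o2 = 0.75136507, 0.77292847
t1, t2 = 0.01, 0.99
eta = 0.5                                  # learning rate

delta_o1 = -(t1 - out_o1) * out_o1 * (1 - out_o1)
delta_o2 = -(t2 - out_o2) * out_o2 * (1 - out_o2)

# Each hidden-to-output weight moves against its gradient, delta_o * out_h
w5_new = w5 - eta * delta_o1 * out_h1      # ~0.35892
w6_new = w6 - eta * delta_o1 * out_h2      # ~0.40867
w7_new = w7 - eta * delta_o2 * out_h1      # ~0.51130
w8_new = w8 - eta * delta_o2 * out_h2      # ~0.56137
```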
WEB REFERENCES
[1] https://www.guru99.com/backpropogation-neural-network.html
[2] https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
[3] http://neuralnetworksanddeeplearning.com/chap2.html
TOPICS TO BE COVERED
• A multilayer perceptron (MLP) is a fully connected class of feedforward artificial neural network (ANN).
Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when
they have a single hidden layer.
• An MLP consists of at least three layers of nodes: an input layer, a hidden layer and an output layer. Except
for the input nodes, each node is a neuron that uses a nonlinear activation function.
• MLP utilizes a chain rule based supervised learning technique called backpropagation or reverse mode of
automatic differentiation for training.
• Its multiple layers and non-linear activation distinguish the MLP from a linear perceptron: it can distinguish data that is not linearly separable (see the sketch after this list).
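As an illustrative sketch (the layer sizes and random weights here are assumptions, not taken from the text), a minimal fully connected MLP forward pass with one hidden layer can be written as:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, W1, b1, W2, b2):
    """Input layer -> hidden layer (nonlinear) -> output layer (nonlinear)."""
    hidden = sigmoid(W1 @ x + b1)     # nonlinear activation at the hidden layer
    return sigmoid(W2 @ hidden + b2)  # output layer

# Example shapes: 3 inputs, 4 hidden neurons, 2 outputs (illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
print(mlp_forward(x, W1, b1, W2, b2))
```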
MLP ARCHITECTURE
If a multilayer perceptron has a linear activation function in all neurons, that is, a linear function that
maps the weighted inputs to the output of each neuron, then linear algebra shows that any number of
layers can be reduced to a two-layer input-output model. In MLPs some neurons use a nonlinear
activation function that was developed to model the frequency of action potentials, or firing, of
biological neurons.
The two historically common activation functions are both sigmoids, and are described by

y(v) = tanh(v)   and   y(v) = 1 / (1 + e^(−v)).

The first is a hyperbolic tangent that ranges from −1 to 1, while the other is the logistic function, which is similar in shape but ranges from 0 to 1.
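A brief sketch of the two activations (straightforward implementations, assumed rather than quoted from the text), showing their different output ranges:

```python
import math

def tanh_act(v):
    return math.tanh(v)                 # output in (-1, 1)

def logistic_act(v):
    return 1.0 / (1.0 + math.exp(-v))   # output in (0, 1)

for v in (-2.0, 0.0, 2.0):
    print(v, round(tanh_act(v), 4), round(logistic_act(v), 4))
```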
MLP LEARNING ALGORITHM
• The MLP algorithm suggests that the weights are initialised to small random numbers,
both positive and negative.
• If the initial weight values are close to 1 or -1 then the inputs to the sigmoid are also
likely to be close to ±1 and so the output of the neuron is either 0 or 1.
• If we view the values of these inputs as having uniform variance, then the typical input
to the neuron will be w√n, where w is the initialization value of the weights. So a
common trick is to set the weights in the range −1/√n < w < 1/√n, where n is the
number of nodes in the input layer to those weights, as sketched below.
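A minimal sketch of this initialisation rule (the function name and layer sizes are illustrative assumptions):

```python
import numpy as np

def init_weights(n_inputs, n_neurons, rng=None):
    # Draw each weight uniformly from (-1/sqrt(n), 1/sqrt(n)),
    # where n is the number of nodes feeding into these weights.
    if rng is None:
        rng = np.random.default_rng()
    limit = 1.0 / np.sqrt(n_inputs)
    return rng.uniform(-limit, limit, size=(n_neurons, n_inputs))

W_hidden = init_weights(4, 8)   # e.g. 4 input nodes -> 8 hidden neurons
```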
THE MULTI-LAYER PERCEPTRON IN PRACTICE
In this section, we are going to look more closely at the choices that can be made about the network in order to use it for solving four different types of real problem: regression, classification, time-series prediction, and data compression.
1. Amount of Training Data
For the MLP with one hidden layer there are (L + 1) × M + (M + 1) × N weights, where L, M, N are the number of
nodes in the input, hidden, and output layers, respectively. The extra +1s come from the bias nodes, which also
have adjustable weights.
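For example (with illustrative layer sizes, not values from the text), a network with L = 2 inputs, M = 3 hidden nodes, and N = 2 outputs has (2 + 1) × 3 + (3 + 1) × 2 = 9 + 8 = 17 adjustable weights.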
2. Number of Hidden Layers
Two hidden layers are sufficient to compute suitably localised functions of the inputs, and so if the function that we want to learn (approximate) is continuous, the network can compute it. It can therefore approximate any decision boundary, not just the linear one that the Perceptron computed.
Self-Assessment Questions
1. Why is the XOR problem exceptionally interesting to neural network researchers?
(a) Because it can be expressed in a way that allows you to use a neural network
(b) Because it is a complex binary operation that cannot be solved using neural networks
(c) Because it can be solved by a single layer perceptron
(d) Because it is the simplest linearly inseparable problem that exists.
2. A perceptron adds up all the weighted inputs it receives, and if it exceeds a certain value, it
outputs a 1, otherwise it just outputs a 0.
a) True
b) False