Backpropagation Step by Step
If you are building your own neural network, you will definitely need to understand how to train it. Backpropagation is a commonly used technique for training neural networks. There are many resources explaining the technique, but this post will explain backpropagation with a concrete example in very detailed, colorful steps.
You can see a visualization of the forward pass and backpropagation here. You can build your neural network using netflow.js.
Overview
In this post, we will build a neural network with three layers:
Input layer with two input neurons
One hidden layer with two neurons
Output layer with a single neuron
Weights, weights, weights
Neural network training is about finding weights that minimize prediction error. We usually start our training with a set of randomly generated weights. Then, backpropagation is used to update the weights in an attempt to correctly map arbitrary inputs to outputs.
Our initial weights will be as follows: w1 = 0.11, w2 = 0.21, w3 = 0.12, w4 = 0.08, w5 = 0.14 and w6 = 0.15.
Dataset
Our dataset has one sample with two inputs and one output.
Our single sample is as follows: inputs = [2, 3] and output = [1].
Forward Pass
We will use the given weights and inputs to predict the output. Inputs are multiplied by weights; the results are then passed forward to the next layer.

$$\begin{bmatrix} i_1 & i_2 \end{bmatrix} \cdot \begin{bmatrix} w_1 & w_3 \\ w_2 & w_4 \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \end{bmatrix}$$

$$\begin{bmatrix} 2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 0.11 & 0.12 \\ 0.21 & 0.08 \end{bmatrix} = \begin{bmatrix} 0.85 & 0.48 \end{bmatrix}$$

$$\begin{bmatrix} h_1 & h_2 \end{bmatrix} \cdot \begin{bmatrix} w_5 \\ w_6 \end{bmatrix} = \begin{bmatrix} 0.85 & 0.48 \end{bmatrix} \cdot \begin{bmatrix} 0.14 \\ 0.15 \end{bmatrix} = 0.85 \times 0.14 + 0.48 \times 0.15 = 0.191$$
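The same forward pass can be written in a few lines of NumPy. This is only a minimal sketch of the example; the variable names (W_ih for the input-to-hidden weights, W_ho for the hidden-to-output weights) are mine, not from the post:

```python
import numpy as np

# single training sample: inputs = [2, 3], output = [1]
i = np.array([2.0, 3.0])            # [i1, i2]
target = 1.0

# initial weights from the post
W_ih = np.array([[0.11, 0.12],      # [[w1, w3],
                 [0.21, 0.08]])     #  [w2, w4]]
W_ho = np.array([0.14, 0.15])       # [w5, w6]

h = i @ W_ih                        # hidden layer: [0.85, 0.48]
prediction = h @ W_ho               # output: 0.191
print(h, prediction)
```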
Calculating Error
Now, it’s time to find out how our network performed by calculating the difference between the actual output and the predicted one. It’s clear that our network output, or prediction, is not even close to the actual output. We can calculate the difference, or the error, as follows:

$$Error = \frac{1}{2}(prediction - actual)^2 = \frac{1}{2}(0.191 - 1)^2 = 0.327$$
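In code, continuing the sketch above:

```python
error = 0.5 * (prediction - target) ** 2   # 0.5 * (0.191 - 1)^2 ≈ 0.327
```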
Reducing Error
Our main goal of the training is to reduce the error, or the difference between prediction and actual output. Since the actual output is constant, "not changing", the only way to reduce the error is to change the prediction value. The question now is, how to change the prediction value?
By decomposing the prediction into its basic elements, we can find that weights are the variable elements affecting the prediction value. In other words, in order to change the prediction value, we need to change the weights.

$$prediction = h_1 w_5 + h_2 w_6$$

$$prediction = (i_1 w_1 + i_2 w_2) w_5 + (i_1 w_3 + i_2 w_4) w_6$$
The question now is how to change/update the weights so that the error is reduced?
The answer is Backpropagation!
Backpropagation
Backpropagation, short for "backward propagation of errors", is a mechanism used to update
the weights using gradient descent. It calculates the gradient of the error function with respect to the neural
network's weights. The calculation proceeds backwards through the network.
Gradient descent is an iterative optimization algorithm for finding the minimum of a function; in our case we want to minimize the error function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient of the function at the current point.

For example, to update w6, we take the current w6 and subtract the partial derivative of the error function with respect to w6. Optionally, we multiply the derivative of the error function by a selected number to make sure that the new updated weight is minimizing the error function; this number is called the learning rate.
$$w_6 = w_6 - a \frac{\partial Error}{\partial w_6}$$
The derivative of the error function is evaluated by applying the chain rule as follows:

$$\frac{\partial Error}{\partial w_6} = \frac{\partial Error}{\partial prediction} \cdot \frac{\partial prediction}{\partial w_6}$$

$$\frac{\partial Error}{\partial prediction} = \frac{\partial \left(\frac{1}{2}(prediction - actual)^2\right)}{\partial prediction} = prediction - actual$$

$$\frac{\partial prediction}{\partial w_6} = \frac{\partial \left((i_1 w_1 + i_2 w_2) w_5 + (i_1 w_3 + i_2 w_4) w_6\right)}{\partial w_6} = i_1 w_3 + i_2 w_4 = h_2$$

$$\frac{\partial Error}{\partial w_6} = (prediction - actual) \cdot h_2$$
In terms of the difference Δ = prediction − actual, this is:

$$\frac{\partial Error}{\partial w_6} = \Delta \cdot h_2$$

$$w_6 = w_6 - a (\Delta \cdot h_2)$$
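Continuing the NumPy sketch, the gradient and update for w6 look like this (the learning rate value 0.05 is the one chosen later in the backward pass):

```python
a = 0.05                            # learning rate
delta = prediction - target         # 0.191 - 1 = -0.809

grad_w6 = delta * h[1]              # dError/dw6 = delta * h2 ≈ -0.388
w6_new = W_ho[1] - a * grad_w6      # 0.15 - 0.05 * (-0.388) ≈ 0.169
```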
Similarly, we can derive the update formula for w5 and any other weights existing between the output and the hidden layer.
$$w_5 = w_5 - a (\Delta \cdot h_1)$$

However, when moving backward to update w1, w2, w3 and w4 existing between the input and hidden layer, the partial derivative of the error function with respect to w1, for example, will be as follows:
$$\frac{\partial Error}{\partial w_1} = \frac{\partial Error}{\partial prediction} \cdot \frac{\partial prediction}{\partial h_1} \cdot \frac{\partial h_1}{\partial w_1}$$

$$\frac{\partial Error}{\partial prediction} = prediction - actual$$

$$\frac{\partial prediction}{\partial h_1} = \frac{\partial (h_1 w_5 + h_2 w_6)}{\partial h_1} = w_5$$

$$\frac{\partial h_1}{\partial w_1} = \frac{\partial (i_1 w_1 + i_2 w_2)}{\partial w_1} = i_1$$

$$\frac{\partial Error}{\partial w_1} = (prediction - actual) \cdot w_5 \cdot i_1 = \Delta \cdot i_1 \cdot w_5$$
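Numerically, for w1 in our example (continuing the sketch above):

```python
grad_w1 = delta * W_ho[0] * i[0]    # dError/dw1 = delta * w5 * i1 = -0.809 * 0.14 * 2 ≈ -0.227
w1_new = W_ih[0, 0] - a * grad_w1   # 0.11 - 0.05 * (-0.227) ≈ 0.121
```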
We can find the update formulas for the remaining weights w2, w3 and w4 in the same way.
In summary, the update formulas for all the weights will be as follows:

$$w_6 = w_6 - a (h_2 \cdot \Delta)$$
$$w_5 = w_5 - a (h_1 \cdot \Delta)$$
$$w_4 = w_4 - a (i_2 \cdot \Delta \cdot w_6)$$
$$w_3 = w_3 - a (i_1 \cdot \Delta \cdot w_6)$$
$$w_2 = w_2 - a (i_2 \cdot \Delta \cdot w_5)$$
$$w_1 = w_1 - a (i_1 \cdot \Delta \cdot w_5)$$

We can write the same updates in matrix form:

$$\begin{bmatrix} w_5 \\ w_6 \end{bmatrix} = \begin{bmatrix} w_5 \\ w_6 \end{bmatrix} - a \begin{bmatrix} \Delta \cdot h_1 \\ \Delta \cdot h_2 \end{bmatrix}$$

$$\begin{bmatrix} w_1 & w_3 \\ w_2 & w_4 \end{bmatrix} = \begin{bmatrix} w_1 & w_3 \\ w_2 & w_4 \end{bmatrix} - a \begin{bmatrix} \Delta \cdot i_1 \cdot w_5 & \Delta \cdot i_1 \cdot w_6 \\ \Delta \cdot i_2 \cdot w_5 & \Delta \cdot i_2 \cdot w_6 \end{bmatrix}$$
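In NumPy, both matrix updates can be written compactly; this is a sketch continuing the variables defined earlier, where the outer product np.outer(i, W_ho) is just a compact way of writing the element-wise table above:

```python
W_ho_new = W_ho - a * delta * h                  # [w5, w6] -> [0.174, 0.169]
W_ih_new = W_ih - a * delta * np.outer(i, W_ho)  # [[w1, w3], [w2, w4]] -> [[0.121, 0.132],
                                                 #                          [0.227, 0.098]]
```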
Backward Pass
Using the derived formulas, we can find the new weights.
The learning rate is a hyperparameter, which means that we need to manually guess its value.

$$\Delta = prediction - actual = 0.191 - 1 = -0.809$$

$$a = 0.05$$

$$\begin{bmatrix} w_5 \\ w_6 \end{bmatrix} = \begin{bmatrix} 0.14 \\ 0.15 \end{bmatrix} - 0.05 \, (-0.809) \begin{bmatrix} 0.85 \\ 0.48 \end{bmatrix} = \begin{bmatrix} 0.14 \\ 0.15 \end{bmatrix} - \begin{bmatrix} -0.034 \\ -0.019 \end{bmatrix} = \begin{bmatrix} 0.17 \\ 0.17 \end{bmatrix}$$
$$\begin{bmatrix} w_1 & w_3 \\ w_2 & w_4 \end{bmatrix} = \begin{bmatrix} 0.11 & 0.12 \\ 0.21 & 0.08 \end{bmatrix} - 0.05 \, (-0.809) \begin{bmatrix} 2 \times 0.14 & 2 \times 0.15 \\ 3 \times 0.14 & 3 \times 0.15 \end{bmatrix} = \begin{bmatrix} 0.11 & 0.12 \\ 0.21 & 0.08 \end{bmatrix} - \begin{bmatrix} -0.011 & -0.012 \\ -0.017 & -0.018 \end{bmatrix} = \begin{bmatrix} 0.12 & 0.13 \\ 0.23 & 0.10 \end{bmatrix}$$
Now, using the new weights, we will repeat the forward pass:

$$\begin{bmatrix} 2 & 3 \end{bmatrix} \cdot \begin{bmatrix} 0.12 & 0.13 \\ 0.23 & 0.10 \end{bmatrix} = \begin{bmatrix} 0.93 & 0.56 \end{bmatrix}$$

$$\begin{bmatrix} 0.93 & 0.56 \end{bmatrix} \cdot \begin{bmatrix} 0.17 \\ 0.17 \end{bmatrix} = 0.93 \times 0.17 + 0.56 \times 0.17 = 0.25$$
We can notice that the prediction 0.25 is a little bit closer to the actual output than the previously predicted one, 0.191. We can repeat the same process of backward and forward passes until the error is close or equal to zero.
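Putting it all together, here is a small self-contained training loop in the same spirit (the number of iterations and the printing are my own choices, not from the post):

```python
import numpy as np

i = np.array([2.0, 3.0])             # inputs [i1, i2]
target = 1.0                         # actual output
a = 0.05                             # learning rate

W_ih = np.array([[0.11, 0.12],       # [[w1, w3],
                 [0.21, 0.08]])      #  [w2, w4]]
W_ho = np.array([0.14, 0.15])        # [w5, w6]

for step in range(100):
    # forward pass
    h = i @ W_ih
    prediction = h @ W_ho
    error = 0.5 * (prediction - target) ** 2

    # backward pass
    delta = prediction - target
    W_ih = W_ih - a * delta * np.outer(i, W_ho)   # update w1..w4 (uses the old w5, w6)
    W_ho = W_ho - a * delta * h                   # update w5, w6

    if step % 20 == 0:
        print(f"step {step}: prediction = {prediction:.3f}, error = {error:.4f}")
```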
Backpropagation Visualization
You can see a visualization of the forward pass and backpropagation here. You can build your neural network using netflow.js.