NN Lecture Notes
In order to have some numbers to work with, here are the initial weights, the biases, and the training inputs/outputs.
The goal of backpropagation is to
optimize the weights so that the
neural network can learn how to
correctly map arbitrary inputs to
outputs.
The Forward Pass
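The slide figure with the network diagram and its concrete numbers is not reproduced here. As a stand-in, below is a minimal Python sketch of the forward pass, assuming the initial values standardly used with this worked example: inputs i1 = 0.05, i2 = 0.10, weights w1–w4 = 0.15, 0.20, 0.25, 0.30, w5–w8 = 0.40, 0.45, 0.50, 0.55, and biases b1 = 0.35, b2 = 0.60. These assumed values do reproduce the outputs 0.75136507 and 0.772928465 quoted below.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Assumed initial values (not shown on this slide; see the caveat above).
    i1, i2 = 0.05, 0.10
    w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
    w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
    b1, b2 = 0.35, 0.60

    # Hidden layer: weighted sum of the inputs, squashed by the logistic function
    net_h1 = w1 * i1 + w2 * i2 + b1
    net_h2 = w3 * i1 + w4 * i2 + b1
    out_h1, out_h2 = sigmoid(net_h1), sigmoid(net_h2)

    # Output layer: weighted sum of the hidden outputs, squashed again
    net_o1 = w5 * out_h1 + w6 * out_h2 + b2
    net_o2 = w7 * out_h1 + w8 * out_h2 + b2
    out_o1, out_o2 = sigmoid(net_o1), sigmoid(net_o2)

    print(out_o1)  # ≈ 0.75136507
    print(out_o2)  # ≈ 0.77292847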
• We can now calculate the error for each output neuron using the squared error function and sum them to get the total error:
• $E_{total} = \sum \frac{1}{2}(\text{target} - \text{output})^2$
• The 1/2 is included so that the exponent is cancelled when we differentiate later on.
For example, the target output for O1 is 0.01 but the neural network outputs 0.75136507; therefore its error is:
$E_{o1} = \frac{1}{2}(\text{target}_{o1} - \text{output}_{o1})^2 = \frac{1}{2}(0.01 - 0.75136507)^2 = 0.274811083$
Class exercise: please calculate this yourself.
• Repeating this process for O2 (remembering that the target is 0.99)
For example, the target output for O2 is 0.99 but the neural network outputs 0.772928465; therefore its error is:
$E_{o2} = \frac{1}{2}(\text{target}_{o2} - \text{output}_{o2})^2 = \frac{1}{2}(0.99 - 0.772928465)^2 = 0.023560026$
The total error for the neural network is the sum of these errors:
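$E_{total} = E_{o1} + E_{o2} = 0.274811083 + 0.023560026 = 0.298371109$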
• For the output neuron through which a weight passes, you can calculate how a change in that weight affects the error by first calculating how a change in the neuron's activation affects the error, and then how a change in the weight affects the activation.
You Know What?
The essence of learning in deep
learning is nothing more than
that: adjusting a model’s weights
in response to the error it
produces.
By applying the chain rule:
$\frac{\partial E_{total}}{\partial w_5} = \frac{\partial E_{total}}{\partial out_{o1}} \times \frac{\partial out_{o1}}{\partial net_{o1}} \times \frac{\partial net_{o1}}{\partial w_5}$
Visually, here’s what we’re doing
• We need to figure out each piece in this equation
• First, how much does the total error change with respect to the
output?
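Differentiating the squared-error term for O1 (the $E_{o2}$ term vanishes, since it does not depend on $out_{o1}$):

$\frac{\partial E_{total}}{\partial out_{o1}} = -(\text{target}_{o1} - out_{o1}) = -(0.01 - 0.75136507) = 0.74136507$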
Next, how much does the output of O1 change with respect to its total net input?
• The partial derivative of the logistic function is the output multiplied
by 1 minus the output:
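$\frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1}(1 - out_{o1}) = 0.75136507 \times (1 - 0.75136507) = 0.186815602$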
Finally, how much does the total net input of
O1 change with respect to w5?
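Since $net_{o1} = w_5 \cdot out_{h1} + w_6 \cdot out_{h2} + b_2$ (under the 2-2-2 layout assumed in the sketch above), the derivative is simply the hidden output that $w_5$ multiplies:

$\frac{\partial net_{o1}}{\partial w_5} = out_{h1} = 0.593269992$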
Putting it all together
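Multiplying the three pieces together:

$\frac{\partial E_{total}}{\partial w_5} = 0.74136507 \times 0.186815602 \times 0.593269992 = 0.082167041$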
To decrease the error, we then
subtract this value from the current
weight (optionally multiplied by
some learning rate, eta, which we’ll
set to 0.5):
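With the assumed initial $w_5 = 0.40$:

$w_5^{new} = w_5 - \eta \cdot \frac{\partial E_{total}}{\partial w_5} = 0.4 - 0.5 \times 0.082167041 = 0.35891648$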
We can repeat this process to get
the new weights w6, w7, and w8
So here are the new values:
• w6 = 0.408666186
• w7 = 0.511301270
• w8 = 0.561370121
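As a check, this short continuation of the Python sketch above reproduces the w5–w8 updates (again under the assumed initial values; eta = 0.5, targets 0.01 and 0.99 as given earlier):

    # Continues the forward-pass sketch: out_h1, out_h2, out_o1, out_o2, w5..w8 as above
    eta = 0.5
    t1, t2 = 0.01, 0.99  # target outputs

    # delta = dE/dout * dout/dnet for each output neuron
    delta_o1 = -(t1 - out_o1) * out_o1 * (1 - out_o1)
    delta_o2 = -(t2 - out_o2) * out_o2 * (1 - out_o2)

    # dE/dw is the neuron's delta times the hidden output the weight multiplies
    w5_new = w5 - eta * delta_o1 * out_h1  # ≈ 0.35891648
    w6_new = w6 - eta * delta_o1 * out_h2  # ≈ 0.40866619
    w7_new = w7 - eta * delta_o2 * out_h1  # ≈ 0.51130127
    w8_new = w8 - eta * delta_o2 * out_h2  # ≈ 0.56137012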
• Next, we’ll continue the backwards pass by calculating new values for
w1, w2, w3, and w4.
Visually
Starting with:
$\frac{\partial E_{total}}{\partial w_1} = \frac{\partial E_{total}}{\partial out_{h1}} \times \frac{\partial out_{h1}}{\partial net_{h1}} \times \frac{\partial net_{h1}}{\partial w_1}$