
Practical No.: 3

Title: Write a program to implement a backpropagation feedforward neural network

Objectives: To learn the backpropagation algorithm for training a feedforward neural network

Theory:

Basics of Neural Networks

Introduction:

A neural network works in a manner loosely inspired by the human brain. It takes several inputs, processes them through multiple neurons arranged in one or more hidden layers, and returns the result using an output layer. This result-estimation process is technically known as “Forward Propagation”.

Next, we compare the result with the actual output. The task is to make the output of the neural network as close as possible to the actual (desired) output. Each of the neurons contributes some error to the final output. How do we reduce this error?

We reduce the contribution (the weights) of the neurons that add more to the error, and this happens while travelling back through the neural network to find where the error lies. This process is known as “Backward Propagation”.

To reduce the number of iterations needed to minimize the error, neural networks use a common optimization algorithm known as “Gradient Descent”, which helps to optimize the task quickly and efficiently.
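For intuition, here is a minimal gradient descent sketch (not part of the original write-up; the loss (w - 3)^2 and all variable names are hypothetical) showing how repeatedly stepping against the gradient drives the error down:

w = 0.0               # initial weight (arbitrary starting point)
learning_rate = 0.1   # step size
for _ in range(100):
    grad = 2 * (w - 3)              # derivative of the loss (w - 3)^2 with respect to w
    w = w - learning_rate * grad    # move the weight against the gradient
print(w)              # w approaches 3, the value that minimizes the loss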

Multi-layer perceptron

An MLP consists of one or more layers, called Hidden Layers, stacked in between the Input Layer and the Output Layer.
A typical diagram of this architecture shows just a single hidden layer, but in practice an MLP can contain multiple hidden layers. In addition, another point to remember is that an MLP is fully connected, i.e. every node in a layer is connected to every node in the previous layer and in the following layer (the input layer has no previous layer and the output layer has no following layer).
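To make the fully connected structure concrete, here is a small illustrative sketch (the layer sizes 4, 3 and 1 are borrowed from the code later in this write-up) showing how many weights link each pair of adjacent layers:

import numpy as np

input_neurons, hidden_neurons, output_neurons = 4, 3, 1
wh = np.random.uniform(size=(input_neurons, hidden_neurons))     # 4 x 3 = 12 weights: every input node feeds every hidden node
wout = np.random.uniform(size=(hidden_neurons, output_neurons))  # 3 x 1 = 3 weights: every hidden node feeds the output node
print(wh.shape, wout.shape)  # (4, 3) (3, 1)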

Let’s move on to the next topic, which is a training algorithm for neural networks (to minimize the error). Here, we will look at the most common training algorithm, known as Gradient Descent.

Steps involved in Neural Network methodology

Let’s look at the step-by-step methodology for building a Neural Network (an MLP with one hidden layer, similar to the architecture described above). At the output layer, we have only one neuron, as we are solving a binary classification problem (predict 0 or 1). We could also have two neurons, one for each of the two classes.

First, look at the broad steps:

0.) We take input and output

●​ X as an input matrix
●​ y as an output matrix

1.) Then we initialize weights and biases with random values (this is a one-time initialization; in subsequent iterations, we will use the updated weights and biases). Let us define:

● wh as the weight matrix to the hidden layer
● bh as the bias matrix to the hidden layer
● wout as the weight matrix to the output layer
● bout as the bias matrix to the output layer

2.) Then we take the matrix dot product of the input with the weights assigned to the edges between the input and hidden layers, and add the biases of the hidden layer neurons to the respective results. This is known as a linear transformation:

hidden_layer_input= matrix_dot_product(X,wh) + bh

3.) Perform a non-linear transformation using an activation function (sigmoid). The sigmoid returns the output as 1/(1 + exp(-x)).

hiddenlayer_activations = sigmoid(hidden_layer_input)
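For intuition (illustrative values only), the sigmoid squashes any real-valued input into the range (0, 1):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid(np.array([-2.0, 0.0, 2.0])))  # approximately [0.12, 0.5, 0.88]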

4.) Then perform a linear transformation on the hidden layer activations (take the matrix dot product with the output layer weights and add the bias of the output layer neuron), then apply an activation function (again sigmoid here, but you can use any other activation function depending on your task) to predict the output:

output_layer_input = matrix_dot_product(hiddenlayer_activations, wout) + bout
output = sigmoid(output_layer_input)

All the above steps are known as “Forward Propagation”.
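As a quick sanity check on the shapes involved (a sketch assuming the 3 x 4 input and the 4-3-1 architecture used in the code below), the forward pass can be traced as follows:

import numpy as np

X = np.array([[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]])   # shape (3, 4): 3 samples, 4 features
wh = np.random.uniform(size=(4, 3)); bh = np.random.uniform(size=(1, 3))
wout = np.random.uniform(size=(3, 1)); bout = np.random.uniform(size=(1, 1))

hidden_layer_input = np.dot(X, wh) + bh                              # (3, 4) dot (4, 3) -> (3, 3)
hiddenlayer_activations = 1 / (1 + np.exp(-hidden_layer_input))      # still (3, 3)
output = 1 / (1 + np.exp(-(np.dot(hiddenlayer_activations, wout) + bout)))  # (3, 3) dot (3, 1) -> (3, 1)
print(output.shape)                                                  # (3, 1): one prediction per sample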

5.) Compare the prediction with the actual output and calculate the error term (actual – predicted). The loss is the mean squared error, ((y – output)^2)/2; the error term propagated backwards is its negative gradient with respect to the output:

E = y – output
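For illustration only (the prediction values below are hypothetical), the scalar loss and the error term E can be computed like this:

import numpy as np

y = np.array([[1], [1], [0]])              # actual outputs (same targets as in the code below)
output = np.array([[0.8], [0.6], [0.3]])   # hypothetical predictions
loss = np.mean((y - output) ** 2) / 2      # mean squared error, useful for monitoring training
E = y - output                             # error term: the negative gradient of the loss with respect to output
print(loss, E.ravel())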

6.) Compute the slope/gradient of the hidden and output layer neurons (to compute the slope, we calculate the derivative of the non-linear activation at each layer for each neuron). Since x here denotes the sigmoid output, the gradient of the sigmoid can be written as x * (1 – x).

slope_output_layer = derivatives_sigmoid(output)​
slope_hidden_layer = derivatives_sigmoid(hiddenlayer_activations)
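Note that the shortcut x * (1 – x) works because x here is already the sigmoid output, not the raw input. A quick numerical check (an illustrative sketch at an arbitrary point) confirms it matches a finite-difference estimate:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

z = 0.7                                  # arbitrary pre-activation value
a = sigmoid(z)                           # sigmoid output
analytic = a * (1 - a)                   # derivative written in terms of the output
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)  # finite-difference estimate
print(analytic, numeric)                 # the two values agree closely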

7.) Then compute the change factor (delta) at the output layer, which depends on the gradient of the error multiplied by the slope of the output layer activation:

d_output = E * slope_output_layer

8.) At this step, the error propagates back into the network, which means we compute the error at the hidden layer. For this, we take the dot product of the output layer delta with the weight parameters of the edges between the hidden and output layers (wout.T):

Error_at_hidden_layer = matrix_dot_product(d_output, wout.Transpose)
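As a shape sketch (assuming the 4-3-1 sizes from the code below), multiplying by wout.T routes each sample's output-layer delta back to the three hidden neurons that produced it:

import numpy as np

d_output = np.random.uniform(size=(3, 1))         # output-layer delta: one value per sample
wout = np.random.uniform(size=(3, 1))             # weights from the 3 hidden neurons to the 1 output neuron
Error_at_hidden_layer = np.dot(d_output, wout.T)  # (3, 1) dot (1, 3) -> (3, 3): per-sample error at each hidden neuron
print(Error_at_hidden_layer.shape)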

9.) Compute the change factor (delta) at the hidden layer: multiply the error at the hidden layer by the slope of the hidden layer activation:

d_hiddenlayer = Error_at_hidden_layer * slope_hidden_layer

10.) Then update the weights at the output and hidden layers: the weights in the network are updated from the errors calculated for the training example(s).

wout = wout + matrix_dot_product(hiddenlayer_activations.Transpose, d_output) * learning_rate
wh = wh + matrix_dot_product(X.Transpose, d_hiddenlayer) * learning_rate

learning_rate: the amount by which the weights are updated is controlled by a configuration parameter called the learning rate.

11.) Finally, update the biases at the output and hidden layers: the biases in the network are updated from the aggregated errors at each neuron.

● bias at output_layer = bias at output_layer + sum of delta of output_layer, row-wise * learning_rate
● bias at hidden_layer = bias at hidden_layer + sum of delta of hidden_layer, row-wise * learning_rate

bh = bh + sum(d_hiddenlayer, axis=0) * learning_rate
bout = bout + sum(d_output, axis=0) * learning_rate

Steps 5 to 11 are known as “Backward Propagation”.

One forward and backward propagation iteration is considered one training cycle. As mentioned earlier, when we train a second time, the updated weights and biases are used for forward propagation.

Above, we have updated the weights and biases for the hidden and output layers using a full-batch gradient descent algorithm.

If we train the model for many iterations, the predicted output will come very close to the actual output.

Python Code:

Backpropagation algorithm: code

# importing the library
import numpy as np

# creating the input array
X = np.array([[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]])
print('\n Input:')
print(X)

# creating the output array
y = np.array([[1], [1], [0]])
print('\n Actual Output:')
print(y)

# defining the Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# derivative of the Sigmoid Function (x is the sigmoid output, not the raw input)
def derivatives_sigmoid(x):
    return x * (1 - x)

# initializing the variables
epoch = 5000                     # number of training iterations
lr = 0.1                         # learning rate
inputlayer_neurons = X.shape[1]  # number of features in the data set
hiddenlayer_neurons = 3          # number of hidden layer neurons
output_neurons = 1               # number of neurons at the output layer

# initializing weights and biases
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

# training the model
for i in range(epoch):

    # Forward Propagation
    hidden_layer_input1 = np.dot(X, wh)
    hidden_layer_input = hidden_layer_input1 + bh
    hiddenlayer_activations = sigmoid(hidden_layer_input)
    output_layer_input1 = np.dot(hiddenlayer_activations, wout)
    output_layer_input = output_layer_input1 + bout
    output = sigmoid(output_layer_input)

    # Backpropagation
    E = y - output
    slope_output_layer = derivatives_sigmoid(output)
    slope_hidden_layer = derivatives_sigmoid(hiddenlayer_activations)
    d_output = E * slope_output_layer
    Error_at_hidden_layer = d_output.dot(wout.T)
    d_hiddenlayer = Error_at_hidden_layer * slope_hidden_layer

    # updating weights and biases
    wout += hiddenlayer_activations.T.dot(d_output) * lr
    bout += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr

print('\n Output from the model:')
print(output)
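As a quick sanity check (not part of the original code), the continuous sigmoid outputs can be rounded and compared against the targets; after enough epochs the rounded predictions typically match y:

# appended after the script above: round the predictions and place them next to the targets
print('\n Rounded predictions vs targets:')
print(np.hstack([np.round(output), y]))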

Output:
