0% found this document useful (0 votes)

4 views

4

The document outlines the implementation of the Backpropagation algorithm for deep artificial neural networks (ANNs) with two models: one using sigmoid activations and the other using ReLU activations in hidden layers, both with softmax output. The models are trained on the MNIST dataset for 1000 epochs, with hyperparameters tuned using a validation set. Results show that the ReLU model outperforms the sigmoid model, demonstrating improved accuracy and loss metrics.

Uploaded by

ritammaiti2016

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

4 views

4

Uploaded by

ritammaiti2016

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 6

Write Python code to implement the Backpropagation algorithm for ANN with more than

two hidden layers. Develop two such deep models with the following configurations.

Model 1: uses sigmoid activations at the hidden nodes and softmax activation function at
the output nodes.

Model 2: uses ReLu activation function at the hidden nodes and softmax activation
function at the output nodes.

The hyperparameters (e.g. learning rate, momentum, number of hidden layers, number of
hidden nodes per layer) for both models should be properly tuned using a validation set.
Compare the performance of these two models on MNIST dataset when both the models
are trained up to 1000 epochs.

Introduction
Artificial Neural Networks (ANNs) are widely used for various machine learning tasks, including image
classification. One of the most critical learning techniques in ANNs is the backpropagation algorithm,
which adjusts the weights of neurons based on error gradients. In this assignment, we implement
backpropagation for deep ANNs with more than two hidden layers and analyze their performance on the
MNIST dataset.

Backpropagation Algorithm

Backpropagation is an iterative, gradient-based optimization algorithm used to train neural networks. It

consists of the following steps:

1. Forward Propagation: Compute activations of neurons from the input layer to the output layer.

2. Compute Loss: Measure the error between predicted and actual outputs using a loss function
(e.g., categorical cross-entropy for classification).

3. Backward Propagation: Compute the gradients of the loss function with respect to the network's
parameters using the chain rule of differentiation.

4. Weight Update: Adjust weights using a gradient descent-based optimizer, such as Stochastic
Gradient Descent (SGD) or Adam.

Model Configurations

We develop two deep models with different activation functions at the hidden layers:

• Model 1: Uses the sigmoid activation function in the hidden layers and the softmax activation
function in the output layer.

• Model 2: Uses the ReLU activation function in the hidden layers and the softmax activation
function in the output layer.

The hyperparameters, such as learning rate, momentum, number of hidden layers, and number of hidden
nodes per layer, are tuned using a validation set. Both models are trained for 1000 epochs.
Import Required Libraries

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt

# Load MNIST dataset

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0 # Normalize data
y_train, y_test = to_categorical(y_train), to_categorical(y_test)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.

11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step

Model 1: Sigmoid Hidden Layers

model1 = Sequential([
Flatten(input_shape=(28, 28)),
Dense(256, activation='sigmoid'),
Dense(128, activation='sigmoid'),
Dense(64, activation='sigmoid'),
Dense(10, activation='softmax')
])

model1.compile(optimizer=Adam(learning_rate=0.001),
loss='categorical_crossentropy',
metrics=['accuracy'])

/usr/local/lib/python3.11/dist-packages/keras/src/layers/reshaping/flatten.py:37: UserWa
super().__init__(**kwargs)

Model 2: ReLU Hidden Layers

model2 = Sequential([
Flatten(input_shape=(28, 28)),
Dense(256, activation='relu'),
Dense(128, activation='relu'),
Dense(64, activation='relu'),
Dense(10, activation='softmax')
])

model2.compile(optimizer=Adam(learning_rate=0.001),
loss='categorical_crossentropy',
metrics=['accuracy'])

Training the Models

history1 = model1.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10, batch_s

history2 = model2.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10, batch_s

Epoch 1/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 6s 7ms/step - accuracy: 0.9853 - loss: 0.0506 - val_accurac
Epoch 2/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 9ms/step - accuracy: 0.9891 - loss: 0.0379 - val_accurac
Epoch 3/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 6s 7ms/step - accuracy: 0.9909 - loss: 0.0310 - val_accurac
Epoch 4/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.9926 - loss: 0.0262 - val_accurac
Epoch 5/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 10s 8ms/step - accuracy: 0.9954 - loss: 0.0174 - val_accura
Epoch 6/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 9s 7ms/step - accuracy: 0.9964 - loss: 0.0144 - val_accurac
Epoch 7/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 10s 7ms/step - accuracy: 0.9965 - loss: 0.0116 - val_accura
Epoch 8/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.9974 - loss: 0.0097 - val_accurac
Epoch 9/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 9s 7ms/step - accuracy: 0.9971 - loss: 0.0102 - val_accurac
Epoch 10/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 10s 7ms/step - accuracy: 0.9972 - loss: 0.0086 - val_accura
Epoch 1/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.8778 - loss: 0.4218 - val_accurac
Epoch 2/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 11s 9ms/step - accuracy: 0.9716 - loss: 0.0945 - val_accura
Epoch 3/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 6s 7ms/step - accuracy: 0.9804 - loss: 0.0618 - val_accurac
Epoch 4/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 8ms/step - accuracy: 0.9863 - loss: 0.0447 - val_accurac
Epoch 5/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 9s 7ms/step - accuracy: 0.9887 - loss: 0.0350 - val_accurac
Epoch 6/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 10s 7ms/step - accuracy: 0.9907 - loss: 0.0289 - val_accura
Epoch 7/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 10s 7ms/step - accuracy: 0.9926 - loss: 0.0243 - val_accura
Epoch 8/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 9ms/step - accuracy: 0.9935 - loss: 0.0203 - val_accurac
Epoch 9/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 8s 7ms/step - accuracy: 0.9945 - loss: 0.0150 - val_accurac
Epoch 10/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 12s 8ms/step - accuracy: 0.9950 - loss: 0.0155 - val_accura
Performance Comparison

Plot Accuracy and Loss

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(history1.history['accuracy'], label='Model 1 Accuracy')
plt.plot(history2.history['accuracy'], label='Model 2 Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history1.history['loss'], label='Model 1 Loss')
plt.plot(history2.history['loss'], label='Model 2 Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()

plt.show()

Evaluation
loss1, acc1 = model1.evaluate(x_test, y_test, verbose=0)
loss2, acc2 = model2.evaluate(x_test, y_test, verbose=0)

print(f"Model 1 - Sigmoid: Accuracy = {acc1:.4f}, Loss = {loss1:.4f}")

print(f"Model 2 - ReLU: Accuracy = {acc2:.4f}, Loss = {loss2:.4f}")

Model 1 - Sigmoid: Accuracy = 0.9795, Loss = 0.0784

Model 2 - ReLU: Accuracy = 0.9818, Loss = 0.0757
Conclusion

• Model 1 (Sigmoid) tends to suffer from the vanishing gradient problem, leading to slower
convergence.

• Model 2 (ReLU) performs better due to its ability to mitigate the vanishing gradient issue,
allowing deeper networks to learn efficiently.

• The results indicate that using ReLU in hidden layers improves model performance compared to
sigmoid activation.

Neural Networks For Optimization and Signal Processing Cichocki Unbehauen PDF
100% (2)
Neural Networks For Optimization and Signal Processing Cichocki Unbehauen PDF
548 pages
Lab 12
No ratings yet
Lab 12
6 pages
NN From Scratch PDF 1735495327
No ratings yet
NN From Scratch PDF 1735495327
19 pages
DL Lab Experiments 2
No ratings yet
DL Lab Experiments 2
12 pages
Group_8_Practical
No ratings yet
Group_8_Practical
8 pages
R Deep Neural Network Step by Step
No ratings yet
R Deep Neural Network Step by Step
27 pages
Lab 12
No ratings yet
Lab 12
3 pages
GK Deeplearning
No ratings yet
GK Deeplearning
15 pages
DL LAB MANUAL
No ratings yet
DL LAB MANUAL
44 pages
Question Example
No ratings yet
Question Example
10 pages
Built An AI Based Forecasting Model For Intraday Trading 1713981234
No ratings yet
Built An AI Based Forecasting Model For Intraday Trading 1713981234
4 pages
Experiment 2.4 DL
No ratings yet
Experiment 2.4 DL
4 pages
Backpropagation
No ratings yet
Backpropagation
12 pages
DL Lab Manual
No ratings yet
DL Lab Manual
52 pages
NNDL Record Final
No ratings yet
NNDL Record Final
46 pages
Deep Learning
No ratings yet
Deep Learning
46 pages
lab 8
No ratings yet
lab 8
10 pages
Deep Learning lab with Tensorflow (2)
No ratings yet
Deep Learning lab with Tensorflow (2)
84 pages
ML Exp 4
No ratings yet
ML Exp 4
7 pages
DL_EXP-2_16010422230
No ratings yet
DL_EXP-2_16010422230
6 pages
E9 205 - Machine Learning For Signal Processing
No ratings yet
E9 205 - Machine Learning For Signal Processing
2 pages
cs519 hw2
No ratings yet
cs519 hw2
15 pages
dlweek6
No ratings yet
dlweek6
4 pages
Deep Learning LAB
No ratings yet
Deep Learning LAB
47 pages
keras
No ratings yet
keras
4 pages
Deep Learning
No ratings yet
Deep Learning
30 pages
ML Assignment-9
No ratings yet
ML Assignment-9
4 pages
DL2 - Jupyter Notebook
No ratings yet
DL2 - Jupyter Notebook
5 pages
DL-8
No ratings yet
DL-8
4 pages
deeplg3
No ratings yet
deeplg3
8 pages
# !pip Install Keras Tensorflow - U
No ratings yet
# !pip Install Keras Tensorflow - U
24 pages
Neural Network
No ratings yet
Neural Network
10 pages
NNDL_RECORD_MANUAL
No ratings yet
NNDL_RECORD_MANUAL
36 pages
DL Practical
No ratings yet
DL Practical
23 pages
Assignment 2
No ratings yet
Assignment 2
3 pages
Pdf
No ratings yet
Pdf
41 pages
Regularization For Neural Network
No ratings yet
Regularization For Neural Network
37 pages
Deep Learning Lab Practicals
No ratings yet
Deep Learning Lab Practicals
24 pages
DL
No ratings yet
DL
12 pages
NN & DL Lab Manual 1[1]
No ratings yet
NN & DL Lab Manual 1[1]
44 pages
6 Neural Network
No ratings yet
6 Neural Network
4 pages
Assignment 7 ML
No ratings yet
Assignment 7 ML
20 pages
Deep Record
No ratings yet
Deep Record
44 pages
This Code Fragment Defines A Single Layer With Artificial Neurons, and It Expects Input Variables
No ratings yet
This Code Fragment Defines A Single Layer With Artificial Neurons, and It Expects Input Variables
9 pages
This Code Fragment Defines A Single Layer With Artificial Neurons, and It Expects Input Variables
No ratings yet
This Code Fragment Defines A Single Layer With Artificial Neurons, and It Expects Input Variables
9 pages
Project Report: CS 574 - Computer Vision Using Machine Learning
No ratings yet
Project Report: CS 574 - Computer Vision Using Machine Learning
38 pages
NN From Scratch
No ratings yet
NN From Scratch
5 pages
Backpropagation.docx
No ratings yet
Backpropagation.docx
3 pages
ML Assignment 3
No ratings yet
ML Assignment 3
11 pages
AI lab 8
No ratings yet
AI lab 8
14 pages
new exp (1)
No ratings yet
new exp (1)
12 pages
Python Deep Learning Lab Programs (2)
No ratings yet
Python Deep Learning Lab Programs (2)
35 pages
CCS355-Neural networks and deep learning_____Assignment 1
No ratings yet
CCS355-Neural networks and deep learning_____Assignment 1
15 pages
Back Propagation Algorithm
No ratings yet
Back Propagation Algorithm
4 pages
ML 2.4 Prashant
No ratings yet
ML 2.4 Prashant
3 pages
Assignment3
No ratings yet
Assignment3
6 pages
TA1_U1_DL
No ratings yet
TA1_U1_DL
3 pages
Exp 3
No ratings yet
Exp 3
9 pages
Project 1 - ANN With Backprop
No ratings yet
Project 1 - ANN With Backprop
3 pages
Assignment 2 Dl
No ratings yet
Assignment 2 Dl
10 pages
Backpropagation: Fundamentals and Applications for Preparing Data for Training in Deep Learning
From Everand
Backpropagation: Fundamentals and Applications for Preparing Data for Training in Deep Learning
Fouad Sabry
No ratings yet
ANN Lab Syllabus
No ratings yet
ANN Lab Syllabus
2 pages
Lecture-20 21 22 (ANN)
No ratings yet
Lecture-20 21 22 (ANN)
30 pages
Lecture 3 - MATLAB Representation of Neural Network
No ratings yet
Lecture 3 - MATLAB Representation of Neural Network
6 pages
Assignment Questions 2
No ratings yet
Assignment Questions 2
2 pages
Residual Attention Network For Image Classification
No ratings yet
Residual Attention Network For Image Classification
9 pages
Unit 4 LSTM
No ratings yet
Unit 4 LSTM
85 pages
Few Selected Questions of Neural Network
No ratings yet
Few Selected Questions of Neural Network
3 pages
Question QUIZ MID 2
No ratings yet
Question QUIZ MID 2
6 pages
Deep Learning: Data Mining: Advanced Aspects
No ratings yet
Deep Learning: Data Mining: Advanced Aspects
131 pages
XOR Problem Demonstration Using MATLAB
0% (1)
XOR Problem Demonstration Using MATLAB
19 pages
CNN Architectures: Lenet, Alexnet, VGG, Googlenet, Resnet and More
No ratings yet
CNN Architectures: Lenet, Alexnet, VGG, Googlenet, Resnet and More
9 pages
Neural Language Model, RNNS: Pawan Goyal
No ratings yet
Neural Language Model, RNNS: Pawan Goyal
15 pages
Lec3 MLP Optimization
No ratings yet
Lec3 MLP Optimization
86 pages
Deep Learning
No ratings yet
Deep Learning
2 pages
DL Unit 1
No ratings yet
DL Unit 1
200 pages
8 CNN Example
No ratings yet
8 CNN Example
33 pages
CNN Mind Map by BSN
No ratings yet
CNN Mind Map by BSN
1 page
CNN RNN LSTM GRU Simple
100% (3)
CNN RNN LSTM GRU Simple
20 pages
unit 5
No ratings yet
unit 5
46 pages
Deep Belief Network
No ratings yet
Deep Belief Network
4 pages
Keras Cheat Sheet Python
No ratings yet
Keras Cheat Sheet Python
1 page
Cs490 Advanced Topics in Computing (Deep Learning) : Lecture 16: Convolutional Neural Networks (CNNS)
No ratings yet
Cs490 Advanced Topics in Computing (Deep Learning) : Lecture 16: Convolutional Neural Networks (CNNS)
63 pages
Multilayer Perceptron and Uppercase Handwritten Characters Recognition
No ratings yet
Multilayer Perceptron and Uppercase Handwritten Characters Recognition
4 pages
Atal - Gan
No ratings yet
Atal - Gan
67 pages
Aneja Convolutional Image Captioning CVPR 2018 Paper
No ratings yet
Aneja Convolutional Image Captioning CVPR 2018 Paper
10 pages
4 4 Choosing The Right Activation Function For Neural Networks
No ratings yet
4 4 Choosing The Right Activation Function For Neural Networks
25 pages
Deep Learning Syllabus
No ratings yet
Deep Learning Syllabus
2 pages
Generative AI Research Papers
No ratings yet
Generative AI Research Papers
3 pages
Lab2 - Perceptron and Adaline Networks
No ratings yet
Lab2 - Perceptron and Adaline Networks
7 pages