
Experiment 3

Aim: To study and implement Multi-Layer Perceptron

A Multi-Layer Perceptron (MLP) is a type of artificial neural network (ANN) consisting of multiple layers of nodes (neurons) connected to one another. It is one of the most fundamental and widely used architectures in machine learning for supervised learning tasks, such as classification and regression. MLPs belong to a broader category of networks called feedforward neural networks, where data flows in one direction: from the input layer through the hidden layers to the output layer.

Here’s a detailed breakdown of the components and functioning of an MLP:

1. Architecture of an MLP

An MLP consists of three types of layers:

● Input layer: The first layer where the input features are fed into the network. The
number of nodes in this layer is equal to the number of input features. For
example, for an image input of size 28x28 pixels (like in MNIST), the input layer
will have 784 nodes (28x28=784).
● Hidden layers: One or more layers of neurons between the input and output
layers. These layers are where most of the computation happens. The number
of neurons in each hidden layer is a hyperparameter, and its selection often depends on the problem complexity and dataset. Each neuron in a hidden layer
is connected to all neurons in the previous and next layer.
● Output layer: The final layer where the predictions are made. The number of
neurons in the output layer corresponds to the number of classes in
classification tasks or the number of regression outputs. For example, in a
binary classification task, the output layer will have 1 neuron (which will output a
probability).
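To make the layer sizes concrete, here is a minimal sketch of this three-part structure in Keras (the hidden-layer sizes and the MNIST-style 784-input / 10-class setup are illustrative assumptions, not fixed choices):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(64, input_dim=784, activation='relu'))   # hidden layer; 64 neurons is a hyperparameter choice
model.add(Dense(64, activation='relu'))                   # a second hidden layer
model.add(Dense(10, activation='softmax'))                # output layer: one neuron per class
model.summary()                                           # prints layer shapes and parameter counts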

2. Working of an MLP

a) Input to Hidden Layer (Forward Propagation)

Each neuron in the hidden layer receives inputs from all neurons in the previous layer
(input layer or another hidden layer), applies a weighted sum, and passes the result
through an activation function. Mathematically, this can be written as:

z = w1·x1 + w2·x2 + … + wn·xn + b

Where:

● xi are the inputs (features or outputs from the previous layer).


● wi are the weights associated with the inputs.
● b is the bias term.
● n is the number of inputs.

This weighted sum z is then passed through an activation function, such as ReLU
(Rectified Linear Unit), Sigmoid, or Tanh, to introduce non-linearity. The output of this
activation is the input to the next layer.
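As a rough numerical illustration (the input, weight, and bias values below are made up), the weighted sum and a ReLU activation for a single neuron can be computed directly with NumPy:

import numpy as np

x = np.array([0.5, -1.2, 0.3])   # inputs xi (hypothetical values)
w = np.array([0.1, 0.4, -0.2])   # weights wi (hypothetical values)
b = 0.05                         # bias term

z = np.dot(w, x) + b             # weighted sum: z = w1·x1 + ... + wn·xn + b
a = np.maximum(0, z)             # ReLU activation; a is passed to the next layer
print(z, a)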

b) Activation Functions

Activation functions are crucial for allowing the network to learn complex patterns. Without them, the neural network would reduce to a single linear function, regardless of how many layers it has.
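To see why this matters, the short check below (with arbitrary random matrices) confirms that two stacked layers without activations collapse into a single equivalent linear layer:

import numpy as np

np.random.seed(0)
x = np.random.randn(4)                                # input vector (arbitrary)
W1, b1 = np.random.randn(3, 4), np.random.randn(3)    # first layer, no activation
W2, b2 = np.random.randn(2, 3), np.random.randn(2)    # second layer, no activation

two_layers = W2 @ (W1 @ x + b1) + b2
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)            # a single equivalent linear layer
print(np.allclose(two_layers, one_layer))             # True: stacking adds no expressive power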

c) Output Layer

The output layer is where the final prediction is made. For classification, if the problem is
binary, the output can be a single neuron with a Sigmoid activation function to output a probability. For multi-class classification, the Softmax function is used, where each
output neuron represents the probability of the corresponding class.
In a regression problem, the output layer may have one neuron with a linear activation
function (or no activation at all) to predict continuous values.
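As a small illustration with made-up output values, Softmax converts the raw outputs of the final layer into class probabilities that sum to 1, while Sigmoid maps a single output to a probability:

import numpy as np

logits = np.array([2.0, 1.0, 0.1])                 # raw output-layer values (hypothetical)
probs = np.exp(logits) / np.sum(np.exp(logits))    # softmax
print(probs, probs.sum())                          # class probabilities summing to 1.0

p = 1 / (1 + np.exp(-1.5))                         # sigmoid of a single output (binary case)
print(p)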

3. Training an MLP (Backpropagation and Gradient Descent)

The training of an MLP involves adjusting the weights and biases to minimize the error
between predicted and actual outputs. This is done through a process called
backpropagation, which uses gradient descent to optimize the model.

a) Forward Propagation

1. Pass the input through the network (from the input layer to the output layer) to
generate predictions.

b) Loss Calculation

1. Calculate the loss (or error) by comparing the network's predictions to the true
labels. Common loss functions include:
○ Cross-entropy loss (for classification tasks)
○ Mean squared error (MSE) (for regression tasks)
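As a small illustration with made-up predictions and labels, both losses can be computed directly before looking at how the error is propagated back:

import numpy as np

# Cross-entropy for one sample whose true class is index 0
y_true_onehot = np.array([1.0, 0.0, 0.0])
y_pred_probs = np.array([0.7, 0.2, 0.1])
cross_entropy = -np.sum(y_true_onehot * np.log(y_pred_probs))
print(cross_entropy)                      # -log(0.7) ≈ 0.357

# Mean squared error for a small regression batch
y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5, 0.0, 2.1])
mse = np.mean((y_true - y_pred) ** 2)
print(mse)                                # 0.17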

c) Backward Propagation (Backpropagation)

1. Backpropagate the error by computing the gradient of the loss with respect to
each weight and bias. This is done using the chain rule of calculus to compute
gradients layer by layer.
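As a toy illustration of the chain rule (one sigmoid neuron with a squared-error loss; all values are made up), the gradients can be computed step by step:

import numpy as np

x = np.array([0.5, -1.2, 0.3])    # inputs (hypothetical)
w = np.array([0.1, 0.4, -0.2])    # weights (hypothetical)
b, y_true = 0.05, 1.0

z = np.dot(w, x) + b
a = 1 / (1 + np.exp(-z))          # sigmoid activation
loss = (a - y_true) ** 2          # squared error

# Chain rule: dL/dw = dL/da · da/dz · dz/dw
dL_da = 2 * (a - y_true)
da_dz = a * (1 - a)
grad_w = dL_da * da_dz * x        # dz/dw = x
grad_b = dL_da * da_dz            # dz/db = 1
print(grad_w, grad_b)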

d) Update Weights (Gradient Descent)

1. Update the weights using gradient descent, which aims to reduce the loss by adjusting the weights in the direction that minimizes the error. With learning rate η, the update rule for weights w and biases b is:

   w := w − η · ∂L/∂w
   b := b − η · ∂L/∂b
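Applying this rule is a single line per parameter; the sketch below uses a hypothetical learning rate and placeholder gradients standing in for the values computed by backpropagation:

import numpy as np

eta = 0.01                                  # learning rate (hypothetical value)
w = np.array([0.1, 0.4, -0.2])              # current weights
b = 0.05                                    # current bias
grad_w = np.array([0.02, -0.05, 0.01])      # dL/dw (placeholder values)
grad_b = 0.03                               # dL/db (placeholder value)

w = w - eta * grad_w                        # w := w - eta · dL/dw
b = b - eta * grad_b                        # b := b - eta · dL/db
print(w, b)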



Dataset link: https://www.kaggle.com/datasets/bhumikatandon/diabetes-classification-dataset/data
Code:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

# Download and load the dataset (Kaggle CLI; pandas reads the CSV inside the zip)
!kaggle datasets download arshid/iris-flower-dataset
data = pd.read_csv("/content/iris-flower-dataset.zip")

X = data.iloc[:, :-1].values   # Features
y = data.iloc[:, -1].values    # Target labels

# Scale the features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Convert string labels into numeric labels, then one-hot encode them so they
# match the softmax output layer and the categorical cross-entropy loss
label_encoder = LabelEncoder()
y_encoded = to_categorical(label_encoder.fit_transform(y))

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y_encoded, test_size=0.2, random_state=42)

# Define the MLP model
model = Sequential()
# First hidden layer (input shape is based on the number of features)
model.add(Dense(128, input_dim=X_train.shape[1], activation='relu'))
# Second hidden layer
model.add(Dense(64, activation='relu'))
# Output layer (one neuron per class, softmax for multi-class classification)
model.add(Dense(y_train.shape[1], activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(X_train, y_train, epochs=50, batch_size=32,
                    validation_split=0.2, verbose=1)

# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test Loss: {test_loss}")
print(f"Test Accuracy: {test_accuracy}")

# Get the predicted probabilities
y_pred_probabilities = model.predict(X_test)

# Convert the predicted probabilities to class labels
# (argmax gives the index of the highest probability)
y_pred = np.argmax(y_pred_probabilities, axis=1)

# Convert y_test from one-hot encoding back to class indices (numeric labels)
y_true_classes = np.argmax(y_test, axis=1)

# Confusion matrix
cm = confusion_matrix(y_true_classes, y_pred)

# Display the confusion matrix as a heatmap
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=label_encoder.classes_,
            yticklabels=label_encoder.classes_)
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.title('Confusion Matrix')
plt.show()

# Accuracy
accuracy = accuracy_score(y_true_classes, y_pred)
print(f"Accuracy: {accuracy}")

# Precision, Recall, F1 (macro averages across all classes)
precision = precision_score(y_true_classes, y_pred, average='macro')
recall = recall_score(y_true_classes, y_pred, average='macro')
f1 = f1_score(y_true_classes, y_pred, average='macro')
print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1 Score: {f1}")

Evaluation Metrics:

Confusion Matrix:
