Experiment 2
Deploy the Confusion Matrix and Simulate Overfitting
Objective:
1. To understand and implement the confusion matrix as a performance evaluation metric for classification models.
2. To simulate overfitting and observe its effect on training and testing performance.
Prerequisites:
A working Python environment (e.g., Google Colab) with TensorFlow/Keras, NumPy, Matplotlib, and scikit-learn; basic familiarity with classification models.
Key Terms:
Confusion Matrix, True Positive (TP), True Negative (TN), False Positive (FP), False Negative (FN), Accuracy, Precision, Recall, F1-score, Overfitting, Generalization.
Experimental Setup:
1. Dataset: Choose a suitable dataset for classification (e.g., Iris, Breast Cancer, or a synthetic dataset). If using a real-world dataset, ensure proper data preprocessing (handling missing values, encoding categorical features).
2. Model Selection: Select a classification algorithm (e.g., Logistic Regression, Decision Tree, Random Forest, Support Vector Machine).
3. Train-Test Split: Divide the dataset into training and testing sets (e.g., 80% for training, 20% for testing).
4. Model Training: Train the selected model on the training data.
5. Prediction: Use the trained model to make predictions on the testing data.
6. Confusion Matrix Calculation: Calculate the confusion matrix from the actual and predicted labels.
7. Performance Metrics Calculation: Calculate accuracy, precision, recall, and F1-score from the confusion matrix (a minimal end-to-end sketch of steps 1-7 follows this list).
8. Overfitting Simulation: Train the model with increasing model complexity (e.g., increasing tree depth in a decision tree, or using a higher-degree polynomial kernel in an SVM) or by training on a smaller subset of the training data. Observe the performance on both the training and testing sets.
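The sketch below walks through steps 1-7 using scikit-learn and the Iris dataset, one of the suggested options; the classifier and split ratio are illustrative choices, not requirements.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report

X, y = load_iris(return_X_y=True)                     # step 1: dataset
X_train, X_test, y_train, y_test = train_test_split(  # step 3: 80/20 split
    X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=200)                # step 2: model selection
clf.fit(X_train, y_train)                             # step 4: training
y_pred = clf.predict(X_test)                          # step 5: prediction

print(confusion_matrix(y_test, y_pred))               # step 6: confusion matrix
print(classification_report(y_test, y_pred))          # step 7: accuracy/precision/recall/F1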
Theory:
The confusion matrix provides a detailed breakdown of the model's performance beyond simple
accuracy. Understanding the different components (TP, TN, FP, FN) is crucial for evaluating the
model's strengths and weaknesses, especially in imbalanced datasets. Overfitting occurs when
a model becomes too complex and starts memorizing the training data, leading to poor
generalization to new data. This is often indicated by high accuracy on the training set but low
accuracy on the testing set.
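To make the TP/TN/FP/FN breakdown concrete, the short sketch below derives the four standard metrics from a hypothetical binary confusion matrix; the counts are invented purely for illustration.

TP, FN = 40, 10   # actual positives: correctly / incorrectly classified (hypothetical counts)
FP, TN = 5, 45    # actual negatives: incorrectly / correctly classified

accuracy  = (TP + TN) / (TP + TN + FP + FN)  # fraction of all predictions that are correct
precision = TP / (TP + FP)                   # of the predicted positives, how many are real
recall    = TP / (TP + FN)                   # of the actual positives, how many were found
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}, f1={f1:.2f}")

On imbalanced data, precision and recall can diverge sharply even when accuracy looks high, which is exactly why the confusion matrix matters there.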
Experimental Procedure:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense
from sklearn.metrics import confusion_matrix
# Simple fully connected classifier for 32x32 RGB images (10 classes).
model = Sequential([
    Flatten(input_shape=(32, 32, 3)),   # flatten each image to a 3072-dim vector
    Dense(128, activation='relu'),      # hidden layer
    Dense(10, activation='softmax')     # output: class probabilities
])
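The data-loading, compilation, and training cells are missing from this export; the sketch below reconstructs them under two assumptions: the dataset is CIFAR-10 (consistent with the 32x32x3 input shape and the 1563 batches per epoch in the logs), and the labels are one-hot encoded (consistent with the np.argmax(y_test, axis=1) call further below).

from tensorflow.keras.utils import to_categorical

# Assumed dataset: CIFAR-10 (matches the 32x32x3 input shape above).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0        # scale pixels to [0, 1]
y_train = to_categorical(y_train, 10)                    # one-hot labels (assumed)
y_test = to_categorical(y_test, 10)

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Baseline run, then a longer continuation to simulate overfitting;
# the plotting cell further below references history_overfit.
history = model.fit(x_train, y_train, epochs=10,
                    validation_data=(x_test, y_test))
history_overfit = model.fit(x_train, y_train, epochs=50,
                            validation_data=(x_test, y_test))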
(Output, first training run: over 10 epochs the training accuracy rises from 0.2936 to 0.4414 and the loss falls from 1.9858 to 1.5599; the validation columns were truncated in the original export.)
(Output, extended training run: through epoch 28 of 50 the training accuracy plateaus near 0.50 and the loss near 1.39; epochs 29-50 and all validation columns were truncated in the original export.)
# Evaluate the trained model on the held-out test set.
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
print('Test Loss:', loss)
print('Test Accuracy:', accuracy)
# Convert predicted probabilities and one-hot labels back to class indices.
y_pred = model.predict(x_test)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true_classes = np.argmax(y_test, axis=1)

cm = confusion_matrix(y_true_classes, y_pred_classes)
print("Confusion Matrix:\n", cm)
# Training vs. validation accuracy across epochs; a widening gap between
# the two curves is the classic signature of overfitting.
plt.plot(history_overfit.history['accuracy'])
plt.plot(history_overfit.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
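The Result section also calls for loss curves; a companion plot over the same history_overfit object (a straightforward variation, not shown in the original notebook) is:

# Training vs. validation loss: validation loss that rises while
# training loss keeps falling also indicates overfitting.
plt.plot(history_overfit.history['loss'])
plt.plot(history_overfit.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()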
Data Leakage: Ensure that information from the testing set does not leak into the training
set.
Imbalanced Datasets: If the dataset has a skewed class distribution, consider using
appropriate evaluation metrics (e.g., precision, recall, F1-score) and techniques (e.g.,
oversampling, undersampling).
Random Initialization: Be mindful of the random initialization of model parameters and use
a fixed random seed for reproducibility.
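For that last point, a typical seed-fixing cell (illustrative; the exact calls depend on the libraries in use) looks like:

import random
import numpy as np
import tensorflow as tf

SEED = 42                 # any fixed value works; 42 is only a convention
random.seed(SEED)         # Python's built-in RNG
np.random.seed(SEED)      # NumPy (shuffling, synthetic data)
tf.random.set_seed(SEED)  # TensorFlow weight initialization and dropout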
Result:
1. The confusion matrix computed on the testing set.
2. Accuracy, precision, recall, and F1-score derived from the confusion matrix.
3. Plots showing the training and testing accuracy/loss as a function of model complexity or training data size, demonstrating the effects of overfitting.
Short Questions:
1. What are the advantages of using a confusion matrix over simple accuracy?
2. How does overfitting affect the model's performance on training and testing data?
3. How can you interpret the values in a confusion matrix to understand the model's behavior?