dlweek6
Experiment No-6:
AIM:
To perform regularization on a dataset with different hyperparameter values and identify the best model.
DESCRIPTION:
Overfitting occurs when a model learns not only the underlying patterns in the training data but
also the noise, leading to poor performance on unseen data. Regularization is crucial for
enhancing the generalization capabilities of machine learning models, allowing them to
perform well on new, unseen datasets.
What is Regularization?
Regularization introduces a penalty term to the loss function used during model training. This
penalty discourages overly complex models by constraining the model's parameters, thus
controlling their ability to fit the training data. The primary goal is to improve the model’s
ability to generalize beyond the training set, enhancing its performance on unseen data.
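As a concrete illustration of the general idea, the short sketch below computes a penalized cost for a small weight vector using the L1 and L2 penalty forms defined in the list that follows. All numbers (the data loss, the weights, and λ) are made up purely for demonstration.
import numpy as np

# Toy numbers chosen only to illustrate a penalized cost
data_loss = 0.35                        # loss from the data-fitting term alone
weights = np.array([0.8, -1.2, 0.05])   # current model weights
lam = 0.01                              # regularization strength λ

l1_cost = data_loss + lam * np.sum(np.abs(weights))  # L1-penalized cost
l2_cost = data_loss + lam * np.sum(weights ** 2)     # L2-penalized cost
print(f"L1 cost: {l1_cost:.4f}, L2 cost: {l2_cost:.4f}")
Larger weights increase the penalty, so minimizing the penalized cost pushes the optimizer toward smaller (or, for L1, sparser) weights.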
Common regularization techniques in deep learning include the following (a short Keras sketch illustrating all four appears after this list):
1. L1 Regularization (Lasso Regression):
o This technique adds a penalty proportional to the sum of the absolute values of the
weights to the loss function. The regularized cost function becomes:
Cost function = Loss + λ ∑ |wᵢ|
o L1 regularization tends to produce sparse weight matrices, meaning it can drive
some weights exactly to zero. This property is useful for feature selection, as it
effectively eliminates unnecessary features from the model.
2. L2 Regularization (Ridge Regression):
o L2 regularization adds a penalty proportional to the sum of the squares of the
weights to the loss function:
Cost function = Loss + λ ∑ wᵢ²
o This technique discourages large weights but does not necessarily drive them to
zero. It is often preferred over L1 regularization because keeping all weights small,
rather than zeroing some out, generally leads to better generalization.
3. Dropout:
o Dropout is a regularization technique that randomly sets a fraction of a layer's
units to zero during training, which prevents neurons from co-adapting too much.
It is typically applied after dense or convolutional layers and helps to create more
robust features by encouraging redundancy in the network.
4. Early Stopping:
o This method involves monitoring the model’s performance on a validation set
during training and stopping the training process once the performance ceases to
improve. Early stopping helps to avoid overfitting by preventing the model from
training too long.
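The sketch below shows where each of these four techniques plugs into a Keras model. It is only an illustration under assumed settings: the layer sizes, the penalty strength 0.001, the dropout rate 0.3, and the early-stopping patience of 3 epochs are illustrative choices, not values taken from this experiment.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.regularizers import l1, l2
from tensorflow.keras.callbacks import EarlyStopping

# Illustrative network combining L1/L2 penalties and dropout (assumed sizes and rates)
model = Sequential([
    Dense(128, activation='relu', input_shape=(784,),
          kernel_regularizer=l2(0.001)),   # L2 penalty on this layer's weights
    Dropout(0.3),                          # randomly zero 30% of activations
    Dense(64, activation='relu',
          kernel_regularizer=l1(0.001)),   # L1 penalty encourages sparse weights
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Early stopping: halt training once validation loss stops improving for 3 epochs
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
# model.fit(x_train, y_train, validation_split=0.1, epochs=50, callbacks=[early_stop])
The fit call is shown commented out because it assumes the MNIST arrays prepared in the CODE section below.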
Implementation Plan
1. Dataset Preparation: We will use the well-known MNIST handwritten-digit dataset
to facilitate training and evaluation.
2. Model Design: A neural network model will be designed with the flexibility to
incorporate different regularization techniques. Layers will include activation functions,
dropout, and the regularization parameters.
3. Training the Model: The models will be trained with varying hyperparameters for the
regularization methods. For example:
CODE:
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
# Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
# Hyperparameter configurations to compare (values taken from the runs shown under OUTPUT)
# NOTE: the network width (128 units per hidden layer) and ReLU activations are assumed choices.
param_grid = [
    {'epochs': 10, 'batch_size': 128, 'optimizer': 'adam', 'dropout': 0.2, 'hidden_layers': 3},
    {'epochs': 20, 'batch_size': 64, 'optimizer': 'adam', 'dropout': 0.3, 'hidden_layers': 4},
    {'epochs': 15, 'batch_size': 256, 'optimizer': 'rmsprop', 'dropout': 0.4, 'hidden_layers': 5},
    {'epochs': 12, 'batch_size': 128, 'optimizer': 'sgd', 'dropout': 0.3, 'hidden_layers': 4},
    {'epochs': 10, 'batch_size': 128, 'optimizer': 'adam', 'dropout': 0.5, 'hidden_layers': 5},
]

results = []
for params in param_grid:
    print(f"Training with params: {params}")

    # Build a fully connected network with dropout after each hidden layer
    model = Sequential()
    model.add(Dense(128, activation='relu', input_shape=(784,)))
    model.add(Dropout(params['dropout']))
    for _ in range(params['hidden_layers'] - 1):
        model.add(Dense(128, activation='relu'))
        model.add(Dropout(params['dropout']))
    model.add(Dense(10, activation='softmax'))

    model.compile(optimizer=params['optimizer'],
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    # Train the model with the current hyperparameter settings
    model.fit(x_train, y_train,
              epochs=params['epochs'],
              batch_size=params['batch_size'],
              validation_data=(x_test, y_test),
              verbose=0)

    # Evaluate the model
    score = model.evaluate(x_test, y_test, verbose=0)
    print(f"Test accuracy for params {params}: {score[1]:.4f}")
    results.append({
        'epochs': params['epochs'],
        'batch_size': params['batch_size'],
        'optimizer': params['optimizer'],
        'dropout': params['dropout'],
        'hidden_layers': params['hidden_layers'],
        'test_accuracy': score[1]
    })
# Convert results to DataFrame and display
results_df = pd.DataFrame(results)
print("\nResults summary:")
print(results_df)
OUTPUT:
Training with params: {'epochs': 10, 'batch_size': 128, 'optimizer': 'adam', 'dropout': 0.2,
'hidden_layers': 3}
Test accuracy for params {'epochs': 10, 'batch_size': 128, 'optimizer': 'adam', 'dropout': 0.2,
'hidden_layers': 3}: 0.9788
Training with params: {'epochs': 20, 'batch_size': 64, 'optimizer': 'adam', 'dropout': 0.3,
'hidden_layers': 4}
Test accuracy for params {'epochs': 20, 'batch_size': 64, 'optimizer': 'adam', 'dropout': 0.3,
'hidden_layers': 4}: 0.9839
Training with params: {'epochs': 15, 'batch_size': 256, 'optimizer': 'rmsprop', 'dropout': 0.4,
'hidden_layers': 5}
Test accuracy for params {'epochs': 15, 'batch_size': 256, 'optimizer': 'rmsprop', 'dropout': 0.4,
'hidden_layers': 5}: 0.9823
Training with params: {'epochs': 12, 'batch_size': 128, 'optimizer': 'sgd', 'dropout': 0.3,
'hidden_layers': 4}
Test accuracy for params {'epochs': 12, 'batch_size': 128, 'optimizer': 'sgd', 'dropout': 0.3,
'hidden_layers': 4}: 0.9522
Training with params: {'epochs': 10, 'batch_size': 128, 'optimizer': 'adam', 'dropout': 0.5,
'hidden_layers': 5}
Test accuracy for params {'epochs': 10, 'batch_size': 128, 'optimizer': 'adam', 'dropout': 0.5,
'hidden_layers': 5}: 0.9773
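RESULT:
Among the configurations tried, the model trained with the Adam optimizer for 20 epochs, batch size 64, dropout 0.3 and 4 hidden layers achieved the highest test accuracy (0.9839) and is identified as the best model.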