Experiment No.
Aim: Design the architecture and implement the autoencoder model for Image Compression.
Introduction
Image compression aims to reduce the size of image files while preserving important information to
enable efficient storage and transmission. Traditional methods like JPEG rely on hand-crafted
transforms (such as the DCT) followed by quantization, whereas with deep learning, autoencoders
have emerged as powerful tools for learned compression.
An autoencoder is a type of neural network designed to learn efficient data representations in an
unsupervised manner. It consists of two main parts:
• Encoder: Compresses the input image into a lower-dimensional latent representation (code).
• Decoder: Reconstructs the original image from this compressed code.
By training the network to minimize the difference between the original and reconstructed images,
the autoencoder learns how to compress and decompress data efficiently.
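For intuition, the quantity being minimized is simply the difference between the original image and its reconstruction. The following sketch computes that difference with numpy; the arrays are made-up stand-ins for a real image and a decoder output, not part of this practical's dataset.

import numpy as np

# Stand-ins for a 28x28 grayscale image and its reconstruction,
# with pixel values scaled to [0, 1]. Shapes and values are illustrative only.
original = np.random.rand(28, 28)
reconstructed = np.random.rand(28, 28)

# The training objective drives this value towards zero.
reconstruction_error = np.mean((original - reconstructed) ** 2)
print("Mean squared reconstruction error:", reconstruction_error)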
Autoencoder Architecture
Key Points:
• The encoder and decoder are typically mirror images in terms of architecture.
• The latent vector size controls the compression ratio: smaller latent space means more
compression but potentially lower reconstruction quality.
• Activation functions such as ReLU are used in the intermediate layers, with sigmoid or tanh in
the output layer to keep the reconstructed pixel values in a valid range.
• The autoencoder learns to encode images into a compressed form by minimizing a
reconstruction loss, such as the Mean Squared Error (MSE) between the input image x and the
reconstruction x̂:

  L = \frac{1}{N} \sum_{i=1}^{N} \lVert x_i - \hat{x}_i \rVert^2
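A minimal sketch of such an architecture is shown below, assuming TensorFlow/Keras is available and that the images are 28x28 grayscale, flattened to 784 values. The layer sizes and the latent dimension of 32 are illustrative choices, not values fixed by this practical.

import tensorflow as tf
from tensorflow.keras import layers, Model

input_dim = 784    # 28 x 28 pixels, flattened
latent_dim = 32    # size of the compressed code; smaller => higher compression

# Encoder: compresses the input image into the latent code
inputs = tf.keras.Input(shape=(input_dim,))
h_enc = layers.Dense(128, activation="relu")(inputs)
code = layers.Dense(latent_dim, activation="relu")(h_enc)

# Decoder: reconstructs the image from the latent code (mirrors the encoder)
h_dec = layers.Dense(128, activation="relu")(code)
outputs = layers.Dense(input_dim, activation="sigmoid")(h_dec)  # pixels in [0, 1]

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")  # minimize the reconstruction MSE
autoencoder.summary()

With these illustrative sizes, 784 pixel values are squeezed into 32 latent values, a nominal compression factor of roughly 24:1 (ignoring numeric precision and any entropy coding of the code).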
Advantages of Autoencoders for Image Compression
• Learned Compression: Unlike traditional fixed algorithms, autoencoders learn
compression schemes directly from data, potentially achieving better compression for
specific image types.
• Non-linear Feature Extraction: Can capture complex patterns and structures beyond linear
transforms.
• End-to-End Training: The entire compression-decompression pipeline is trained jointly to
optimize reconstruction quality.
• Adaptability: Can be fine-tuned for different datasets and requirements.
Disadvantages and Challenges:
• Reconstruction Quality vs. Compression Tradeoff: Higher compression (smaller latent
space) often reduces image quality.
• Computational Cost: Training deep autoencoders requires substantial computation and
data.
• Overfitting Risk: Without enough data or regularization, models can overfit, losing
generalization.
• No Explicit Control Over Compression Ratio: Unlike traditional codecs with set bitrates,
controlling exact compression size is more challenging.
Mini-Batch Gradient Descent (MBGD)
Definition
Mini-Batch Gradient Descent is a hybrid approach that balances Stochastic Gradient Descent and
full-batch training. Instead of processing one example at a time or the entire dataset at once,
MBGD splits the training data into small chunks (mini-batches) and computes the gradient for each
batch. This method provides more accurate gradient estimates than SGD while being more efficient
than full-batch training.
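As a quick illustration of the splitting step (separate from the full implementation in the Code section below), the following sketch shuffles a made-up dataset and walks over it in mini-batches; the sample count and batch size here are arbitrary assumptions.

import numpy as np

X = np.random.rand(100, 3)   # illustrative dataset: 100 samples, 3 features
batch_size = 5               # assumed mini-batch size

shuffled = np.random.permutation(len(X))        # reshuffle once per epoch
for start in range(0, len(X), batch_size):
    batch_indices = shuffled[start:start + batch_size]
    Xj = X[batch_indices]
    # gradients would be computed on Xj and the parameters updated here

print("Gradient updates per epoch:", int(np.ceil(len(X) / batch_size)))  # -> 20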
Advantages:
• Smoother and more stable learning than SGD
• Compatible with parallel and GPU processing
• Faster convergence due to more accurate gradients
Disadvantages:
• Requires careful tuning of mini-batch size
• Consumes more memory than SGD
Code:
import numpy as np
import matplotlib.pyplot as plt

def mini_batch_gradient_descent(X, y_true, epochs=100, batch_size=5, learning_rate=0.01):
    number_of_features = X.shape[1]
    w = np.ones(shape=(number_of_features))  # Initialize weights
    b = 0  # Initialize bias
    total_samples = X.shape[0]
    if batch_size > total_samples:
        batch_size = total_samples
    cost_list = []
    epoch_list = []
    num_batches = int(total_samples / batch_size)  # number of full mini-batches per epoch
    for i in range(epochs):
        # Shuffle the data for each epoch
        random_indices = np.random.permutation(total_samples)
        X_tmp = X[random_indices]
        y_tmp = y_true[random_indices]
        for j in range(0, total_samples, batch_size):
            # Get the mini-batch
            Xj = X_tmp[j:j+batch_size]
            yj = y_tmp[j:j+batch_size]
            # Predict output and calculate gradients for the batch
            y_predicted = np.dot(w, Xj.T) + b
            w_grad = -(2/len(Xj)) * (Xj.T.dot(yj - y_predicted))
            b_grad = -(2/len(Xj)) * np.sum(yj - y_predicted)
            # Update weights and bias
            w = w - learning_rate * w_grad
            b = b - learning_rate * b_grad
            # Calculate the cost for the current batch
            cost = np.mean(np.square(yj - y_predicted))
        # Record the cost every 10 epochs
        if i % 10 == 0:
            cost_list.append(cost)
            epoch_list.append(i)
    return w, b, cost, cost_list, epoch_list
# Execute the MBGD function
# scaled_X / scaled_y: the preprocessed (scaled) feature matrix and target vector
w_mbgd, b_mbgd, cost_mbgd, cost_list_mbgd, epoch_list_mbgd = mini_batch_gradient_descent(
    scaled_X,
    scaled_y.reshape(scaled_y.shape[0],),
    epochs=120,
    batch_size=5
)
print("\nMBGD Learned Weights:", w_mbgd)
print("MBGD Learned Bias:", b_mbgd)

# Plotting the cost over epochs for MBGD
plt.figure(figsize=(10, 6))
plt.xlabel("Epoch")
plt.ylabel("Cost")
plt.title("Mini-Batch Gradient Descent: Cost vs. Epoch")
plt.plot(epoch_list_mbgd, cost_list_mbgd)
plt.show()
For comparison, Stochastic Gradient Descent updates the parameters using a single randomly
selected sample per iteration:

def stochastic_gradient_descent(X, y_true, epochs, learning_rate=0.01):
    import random
    number_of_features = X.shape[1]
    w = np.ones(shape=(number_of_features))  # Initialize weights
    b = 0  # Initialize bias
    total_samples = X.shape[0]
    cost_list = []
    epoch_list = []
    for i in range(epochs):
        # Select a random sample from the dataset
        random_index = random.randint(0, total_samples - 1)
        sample_x = X[random_index]
        sample_y = y_true[random_index]
        # Predict the output and calculate gradients
        y_predicted = np.dot(w, sample_x.T) + b
        w_grad = -(2/total_samples) * (sample_x.T.dot(sample_y - y_predicted))
        b_grad = -(2/total_samples) * (sample_y - y_predicted)
        # Update weights and bias
        w = w - learning_rate * w_grad
        b = b - learning_rate * b_grad
        # Calculate the cost for the current sample and record it every 100 epochs
        cost = np.square(sample_y - y_predicted)
        if i % 100 == 0:
            cost_list.append(cost)
            epoch_list.append(i)
    return w, b, cost, cost_list, epoch_list
# Execute the SGD function
w_sgd, b_sgd, cost_sgd, cost_list_sgd, epoch_list_sgd = stochastic_gradient_descent(
    scaled_X,
    scaled_y.reshape(scaled_y.shape[0],),
    10000
)
print("SGD Learned Weights:", w_sgd)
print("SGD Learned Bias:", b_sgd)

# Plotting the cost over epochs for SGD
plt.figure(figsize=(10, 6))
plt.xlabel("Epoch")
plt.ylabel("Cost")
plt.title("Stochastic Gradient Descent: Cost vs. Epoch")
plt.plot(epoch_list_sgd, cost_list_sgd)
plt.show()
Conclusion: Hence, we have successfully designed the autoencoder architecture for image compression and implemented and compared mini-batch and stochastic gradient descent, observing the decrease in cost over the training epochs.