
Mastering Indian Cuisine Classification Using Deep Learning: A Comprehensive Guide

Last Updated : 22 Aug, 2024

The world of artificial intelligence (AI) and deep learning is advancing remarkably, offering new ways to approach problems that were once considered too complex. One fascinating application is in the domain of food classification, specifically in identifying various dishes from India's diverse culinary landscape.

In this article, we'll walk through creating a deep learning model to classify Indian cuisine, using advanced techniques such as transfer learning, fine-tuning, and hyperparameter optimization.


Why Classify Indian Cuisine?

India's rich culinary heritage and its myriad of regional dishes present a unique challenge for classification models. The diversity in ingredients, cooking methods, and presentation styles across different regions such as North, South, East, and West India makes this task both challenging and intriguing. Building a model that can accurately classify these dishes is not just an exercise in AI but a celebration of India's gastronomic diversity.

Indian Cuisine Classification Using Deep Learning

In this section, we are going to perform the classification using the Indian Cuisine Dataset. Let's break down the code into steps, provide an explanation for each step, and include the respective code snippets.

Step 1: Mount Google Drive

This step is necessary if you're using Google Colab to access files stored on Google Drive.

Python
from google.colab import drive
drive.mount('/content/drive')
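
Once the drive is mounted, it is worth confirming that the dataset folder is reachable. A minimal check, assuming the dataset lives at the path used throughout this article:

Python
import os

# Path to the dataset used in the rest of this article
dataset_path = '/content/drive/MyDrive/food_dataset'

# Listing the class folders confirms the mount worked
print(os.listdir(dataset_path))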


Step 2: Removing Corrupted Images

This step verifies the images in the dataset to ensure they are not corrupted. If an image is found to be corrupted, it is removed from the dataset.

Python
import os
from PIL import Image

def check_images(directory):
    # Walk every file under the dataset directory
    for root, _, files in os.walk(directory):
        for file in files:
            file_path = os.path.join(root, file)
            try:
                img = Image.open(file_path)
                img.verify()  # raises an error for truncated or non-image files
            except (IOError, SyntaxError):
                print(f"Bad file: {file_path}")
                os.remove(file_path)

dataset_path = '/content/drive/MyDrive/food_dataset'
check_images(dataset_path)
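
With corrupted files removed, a quick per-class count helps spot badly imbalanced classes before training. A small sketch, assuming each dish has its own sub-directory under the dataset path:

Python
# Count the image files in each class sub-directory
for class_name in sorted(os.listdir(dataset_path)):
    class_dir = os.path.join(dataset_path, class_name)
    if os.path.isdir(class_dir):
        print(f"{class_name}: {len(os.listdir(class_dir))} images")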


Step 3: Data Preprocessing

This step involves setting up the data generators that will be used to feed images into the model during training. The images are rescaled, and the data is split into training and validation sets.

Python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.3
)

train_gen = train_datagen.flow_from_directory(
    dataset_path,
    target_size=(224, 224),
    class_mode='sparse',
    batch_size=32,
    subset='training',
    shuffle=True
)

validation_gen = train_datagen.flow_from_directory(
    dataset_path,
    target_size=(224, 224),
    class_mode='sparse',
    batch_size=32,
    subset='validation',
    shuffle=True
)

print(train_gen.class_indices)
print(validation_gen.class_indices)

Output:

{'Aloo Puri': 0, 'Bhindi Masala': 1, 'Chhole Bhature': 2, 'Dal Bati Churma': 3, 'Dal Makhni': 4, 'Dhokla': 5, 'Gulab Jamun': 6, 'Idli Sambhar': 7, 'Jalebi': 8, 'Kheer': 9, 'Mushroom': 10, 'Paneer': 11, 'Pav Bhaji': 12, 'Poha': 13, 'Rajma Chawal': 14, 'Rasgulla': 15, 'Rasmalai': 16, 'Sarson ka Saag Makki ki Roti': 17, 'Thepla': 18, 'Vada Pav': 19}
{'Aloo Puri': 0, 'Bhindi Masala': 1, 'Chhole Bhature': 2, 'Dal Bati Churma': 3, 'Dal Makhni': 4, 'Dhokla': 5, 'Gulab Jamun': 6, 'Idli Sambhar': 7, 'Jalebi': 8, 'Kheer': 9, 'Mushroom': 10, 'Paneer': 11, 'Pav Bhaji': 12, 'Poha': 13, 'Rajma Chawal': 14, 'Rasgulla': 15, 'Rasmalai': 16, 'Sarson ka Saag Makki ki Roti': 17, 'Thepla': 18, 'Vada Pav': 19}
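
Before building the model, it can be reassuring to pull one batch from the training generator and check that images and integer labels line up. A minimal sanity check that maps the sparse labels back to class names:

Python
import matplotlib.pyplot as plt

# One batch of rescaled images and their sparse (integer) labels
images, labels = next(train_gen)
idx_to_class = {v: k for k, v in train_gen.class_indices.items()}

plt.figure(figsize=(10, 6))
for i in range(6):
    plt.subplot(2, 3, i + 1)
    plt.imshow(images[i])  # already rescaled to [0, 1]
    plt.title(idx_to_class[int(labels[i])])
    plt.axis('off')
plt.show()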

Step 4: Model Creation with Transfer Learning

In this step, a pre-trained MobileNetV2 model is used as the base. Its layers are frozen initially to retain the features learned on ImageNet, and a custom classification head is added for the 20 food classes in this dataset.

Python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2

reference_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freezing the layers
for layer in reference_model.layers:
    layer.trainable = False

model = models.Sequential([
    reference_model,
    layers.Flatten(),
    layers.BatchNormalization(),
    layers.Dense(units=128, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.2),
    layers.Dense(units=20, activation='softmax')
])
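
Calling model.summary() shows that most of the trainable parameters sit in the Dense layer after Flatten, because MobileNetV2's 7x7x1280 feature map is flattened into 62,720 units. A lighter head built on GlobalAveragePooling2D is a common alternative; it is not used above and is shown only as an option:

Python
model.summary()

# Optional variant: replace Flatten with global average pooling for a much smaller head
alt_model = models.Sequential([
    reference_model,
    layers.GlobalAveragePooling2D(),
    layers.BatchNormalization(),
    layers.Dense(units=128, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.2),
    layers.Dense(units=20, activation='softmax')
])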


Step 5: Model Compilation and Training

The model is compiled with the Adam optimizer, sparse categorical cross-entropy loss (matching the integer labels produced by the generators), and accuracy as the evaluation metric. An early-stopping callback is defined to halt training once the validation loss stops improving, and a ModelCheckpoint callback saves the best model after each epoch; in the run below only the checkpoint callback is passed to fit, but early_stopping can be added to the same callbacks list. The model is then trained for 10 epochs on the training generator and validated on the validation generator.

Python
import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint

early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

checkpoint_callback = ModelCheckpoint(
    filepath='/content/drive/MyDrive/saved_model/model_epoch_{epoch:02d}.h5',
    monitor='val_accuracy',
    save_best_only=True,
    save_freq='epoch',
)

trained_model = model.fit(
    train_gen,
    epochs=10,
    validation_data=validation_gen,
    callbacks=[checkpoint_callback]
)

Output:

Epoch 1/10
30/85 [=========>....................] - ETA: 30s - loss: 1.6956 - accuracy: 0.5208
/usr/local/lib/python3.10/dist-packages/PIL/Image.py:996: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
85/85 [==============================] - 83s 936ms/step - loss: 1.2460 - accuracy: 0.6422 - val_loss: 1.0757 - val_accuracy: 0.7273
Epoch 2/10
85/85 [==============================] - 69s 816ms/step - loss: 0.1729 - accuracy: 0.9774 - val_loss: 0.8590 - val_accuracy: 0.7552
Epoch 3/10
85/85 [==============================] - 69s 820ms/step - loss: 0.0577 - accuracy: 0.9978 - val_loss: 0.8410 - val_accuracy: 0.7552
Epoch 4/10
85/85 [==============================] - 70s 822ms/step - loss: 0.0266 - accuracy: 0.9993 - val_loss: 0.8282 - val_accuracy: 0.7657
Epoch 5/10
85/85 [==============================] - 68s 806ms/step - loss: 0.0185 - accuracy: 0.9996 - val_loss: 0.8367 - val_accuracy: 0.7649
Epoch 6/10
85/85 [==============================] - 70s 825ms/step - loss: 0.0109 - accuracy: 0.9996 - val_loss: 0.8370 - val_accuracy: 0.7649
Epoch 7/10
85/85 [==============================] - 70s 827ms/step - loss: 0.0106 - accuracy: 0.9996 - val_loss: 0.8666 - val_accuracy: 0.7605
Epoch 8/10
85/85 [==============================] - 70s 828ms/step - loss: 0.0095 - accuracy: 0.9996 - val_loss: 0.8606 - val_accuracy: 0.7561
Epoch 9/10
85/85 [==============================] - 68s 800ms/step - loss: 0.0052 - accuracy: 1.0000 - val_loss: 0.8555 - val_accuracy: 0.7570
Epoch 10/10
85/85 [==============================] - 68s 808ms/step - loss: 0.0072 - accuracy: 1.0000 - val_loss: 0.8684 - val_accuracy: 0.7579
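
The UserWarning about palette images in the log above is harmless, but it can be avoided by converting such images to RGB (the alpha channel is not used by the model anyway) during the cleanup step. A hedged sketch that re-saves any non-RGB image in place, so keep a backup of the dataset:

Python
from PIL import Image
import os

def convert_to_rgb(directory):
    # Re-save any palette or RGBA image as plain RGB
    for root, _, files in os.walk(directory):
        for file in files:
            file_path = os.path.join(root, file)
            try:
                img = Image.open(file_path)
                if img.mode != 'RGB':
                    img.convert('RGB').save(file_path)
            except (IOError, SyntaxError):
                print(f"Could not process: {file_path}")

convert_to_rgb(dataset_path)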

Step 6: Fine-tuning the Model

After the initial training, some or all layers of the base model are unfrozen and the model is recompiled with the SGD optimizer at a small learning rate. This process, called fine-tuning, lets the pre-trained features adapt to the food images; training is then resumed for a few more epochs, as sketched after the code below.

Python
# Unfreezing some layers
for layer in reference_model.layers:
    layer.trainable = True

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
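
Recompiling on its own does not update any weights; training has to be resumed so the unfrozen layers can adapt. A short continuation run, with the epoch count chosen purely for illustration:

Python
fine_tune_history = model.fit(
    train_gen,
    epochs=5,  # illustrative; watch the validation metrics and stop sooner if they degrade
    validation_data=validation_gen,
    callbacks=[checkpoint_callback, early_stopping]
)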

Step 7: Saving the Final Model

After training, the final model is saved to Google Drive for later use.

Python
# model.save() returns None, so there is nothing to assign
model.save('/content/drive/MyDrive/saved_model/final.keras')

Step 8: Loading the Model and Making Predictions

The saved model is loaded from Google Drive, and a function is defined to predict the class of a new image.

Python
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
import numpy as np

# Function to predict images
def predict_image(image_path, saved_model):
    img = image.load_img(image_path, target_size=(224, 224))
    img = image.img_to_array(img)
    img = img / 255.0
    img = np.expand_dims(img, axis=0)

    prediction = saved_model.predict(img)
    classes = ['Aloo Puri', 'Bhindi Masala', 'Chhole Bhature', 'Dal Bati Churma', 'Dal Makhni', 'Dhokla', 'Gulab Jamun', 'Idli Sambhar', 'Jalebi', 'Kheer', 'Mushroom', 'Paneer', 'Pav Bhaji', 'Poha', 'Rajma Chawal', 'Rasgulla', 'Rasmalai', 'Sarson ka Saag Makki ki Roti', 'Thepla', 'Vada Pav']

    predicted_class = classes[np.argmax(prediction)]

    return predicted_class

saved_model = load_model('/content/drive/MyDrive/saved_model/final.keras')
image_path = '/content/drive/MyDrive/new.jfif'
predicted_class = predict_image(image_path, saved_model)
print(predicted_class)

Output:

Jalebi
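
Because the final layer is a softmax, prediction holds one probability per class, so the model's confidence can be inspected rather than just the top label. A small sketch that returns the three most likely dishes, reusing the same preprocessing and the 20-entry classes list from predict_image:

Python
import numpy as np
from tensorflow.keras.preprocessing import image

def predict_top_k(image_path, saved_model, classes, k=3):
    # Same preprocessing as predict_image
    img = image.load_img(image_path, target_size=(224, 224))
    img = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)
    probs = saved_model.predict(img)[0]
    top_idx = np.argsort(probs)[::-1][:k]
    return [(classes[i], float(probs[i])) for i in top_idx]

# Example: print(predict_top_k(image_path, saved_model, classes, k=3))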

Step 9: Model Evaluation

The model is evaluated on the validation dataset, and a classification report is generated to show per-class precision, recall, and F1-score (see the note on generator shuffling after the report).

Python
from sklearn.metrics import classification_report

y_predict = model.predict(validation_gen)
y_true = validation_gen.classes

print(classification_report(y_true, y_predict.argmax(axis=1)))

Output:

              precision    recall  f1-score   support

           0       0.03      0.02      0.03        46
           1       0.06      0.07      0.06        72
           2       0.07      0.08      0.07        79
           3       0.08      0.08      0.08        91
           4       0.08      0.07      0.07        46
           5       0.09      0.09      0.09        97
           6       0.04      0.04      0.04        51
           7       0.07      0.07      0.07        90
           8       0.02      0.03      0.02        77
           9       0.05      0.05      0.05        60
          10       0.03      0.02      0.02        63
          11       0.09      0.12      0.10       100
          12       0.00      0.00      0.00        12
          13       0.00      0.00      0.00         4
          14       0.09      0.09      0.09        91
          16       0.00      0.00      0.00         1
          17       0.06      0.06      0.06        90
          18       0.08      0.08      0.08        74

    accuracy                           0.07      1144
   macro avg       0.05      0.05      0.05      1144
weighted avg       0.06      0.07      0.06      1144
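
The per-class scores above are far below the roughly 75% validation accuracy seen during training because validation_gen was created with shuffle=True: the order of the predictions returned by model.predict no longer matches validation_gen.classes, so most labels are compared against the wrong images. Rebuilding the validation generator with shuffle=False aligns labels and predictions:

Python
# Re-create the validation generator with a fixed file order for evaluation
eval_gen = train_datagen.flow_from_directory(
    dataset_path,
    target_size=(224, 224),
    class_mode='sparse',
    batch_size=32,
    subset='validation',
    shuffle=False
)

y_prob = model.predict(eval_gen)
print(classification_report(eval_gen.classes, y_prob.argmax(axis=1)))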

Step 10: Plotting Training and Validation Loss/Accuracy

This step involves plotting the training and validation loss and accuracy over the epochs to visualize the model's performance.

Python
import matplotlib.pyplot as plt

# `trained_model` is the History object returned by model.fit()
training_loss = trained_model.history['loss']
val_loss = trained_model.history['val_loss']
training_acc = trained_model.history['accuracy']
val_acc = trained_model.history['val_accuracy']
epochs = range(len(training_loss))

plt.title('Loss vs Epoch')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.plot(epochs, training_loss, label='Training Loss')
plt.plot(epochs, val_loss, label='Validation Loss')
plt.legend()
plt.show()

Output:

[Plot: training and validation loss across the 10 epochs]
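
The same History object also holds the accuracy curves, so plotting them mirrors the loss plot above. A short continuation using the variables already defined:

Python
plt.title('Accuracy vs Epoch')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.plot(epochs, training_acc, label='Training Accuracy')
plt.plot(epochs, val_acc, label='Validation Accuracy')
plt.legend()
plt.show()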

Why Did MobileNetV2 Stand Out?

Among the three architectures tested—VGG16, ResNet50, and MobileNetV2—MobileNetV2 emerged as the best-performing model. Its efficient architecture, designed for resource-constrained environments, allowed it to achieve high accuracy while maintaining lower computational costs. This makes MobileNetV2 an excellent choice not only for large-scale classification tasks but also for deployment on devices with limited processing power.
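
The VGG16 and ResNet50 runs are not shown in this article, but repeating the comparison only requires swapping the base model while keeping the same head, data generators, and training loop. A sketch, since all three architectures are available in tensorflow.keras.applications:

Python
from tensorflow.keras.applications import VGG16, ResNet50

# Alternative bases for the comparison; reuse the same Sequential head from Step 4
vgg_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
resnet_base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))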

Conclusion

The journey of building an Indian cuisine classification model is as much about understanding the intricacies of deep learning as it is about appreciating the diversity of Indian food. By employing advanced techniques like transfer learning, fine-tuning, and hyperparameter optimization, and by comparing different architectures, we were able to create a robust model capable of classifying a wide range of Indian dishes, with MobileNetV2 leading the way in performance.

