Mastering Indian Cuisine Classification Using Deep Learning: A Comprehensive Guide
Last Updated: 22 Aug, 2024
The world of artificial intelligence (AI) and deep learning is advancing remarkably, offering new ways to approach problems that were once considered too complex. One fascinating application is in the domain of food classification, specifically in identifying various dishes from India's diverse culinary landscape.
In this article, we'll walk through creating a deep learning model to classify Indian cuisine, using advanced techniques such as transfer learning, fine-tuning, and hyperparameter optimization.
Why Classify Indian Cuisine?
India's rich culinary heritage and its myriad of regional dishes present a unique challenge for classification models. The diversity in ingredients, cooking methods, and presentation styles across different regions such as North, South, East, and West India makes this task both challenging and intriguing. Building a model that can accurately classify these dishes is not just an exercise in AI but a celebration of India's gastronomic diversity.
Indian Cuisine Classification Using Deep Learning
In this section, we are going to perform the classification using the Indian Cuisine Dataset. Let's break down the code into steps, provide an explanation for each step, and include the respective code snippets.
Step 1: Mount Google Drive
This step is necessary if you're using Google Colab to access files stored on Google Drive.
Python
from google.colab import drive
drive.mount('/content/drive')
Step 2: Removing Corrupted Images
This step verifies the images in the dataset to ensure they are not corrupted. If an image is found to be corrupted, it is removed from the dataset.
Python
import os
from PIL import Image
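def check_images(directory):
    """Delete every file under `directory` that PIL cannot verify as an image."""
    for root, _, files in os.walk(directory):
        for file in files:
            file_path = os.path.join(root, file)
            try:
                # verify() checks file integrity without decoding the full image
                with Image.open(file_path) as img:
                    img.verify()
            except (IOError, SyntaxError):
                print(f"Bad file: {file_path}")
                os.remove(file_path)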
dataset_path = '/content/drive/MyDrive/food_dataset'
check_images(dataset_path)
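Once corrupted files are removed, it can help to check how many images remain in each class folder, since a class left with very few images will be poorly represented after the train/validation split. The sketch below assumes the one-subfolder-per-class layout that flow_from_directory expects. count_images_per_class is a hypothetical helper, and the demo runs on a temporary directory with made-up class folders rather than on the real dataset.

```python
import os
import tempfile

def count_images_per_class(directory):
    """Return {class_folder_name: number_of_files} for each subfolder."""
    counts = {}
    for entry in sorted(os.listdir(directory)):
        class_dir = os.path.join(directory, entry)
        if os.path.isdir(class_dir):
            counts[entry] = sum(
                1 for name in os.listdir(class_dir)
                if os.path.isfile(os.path.join(class_dir, name))
            )
    return counts

# Demo on a throwaway directory (hypothetical class names and file counts)
with tempfile.TemporaryDirectory() as demo_root:
    for class_name, n_files in [("Jalebi", 3), ("Poha", 1)]:
        os.makedirs(os.path.join(demo_root, class_name))
        for i in range(n_files):
            open(os.path.join(demo_root, class_name, f"{i}.jpg"), "w").close()
    demo_counts = count_images_per_class(demo_root)
    print(demo_counts)
```

On the real data, the same helper could be called as count_images_per_class(dataset_path) right after check_images.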
Step 3: Data Preprocessing
This step sets up the data generators that feed images into the model during training. Pixel values are rescaled to the range [0, 1], and 30% of the images are held out as a validation set.
Python
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.3
)
train_gen = train_datagen.flow_from_directory(
    dataset_path,
    target_size=(224, 224),
    class_mode='sparse',
    batch_size=32,
    subset='training',
    shuffle=True
)
validation_gen = train_datagen.flow_from_directory(
    dataset_path,
    target_size=(224, 224),
    class_mode='sparse',
    batch_size=32,
    subset='validation',
    shuffle=True
)
print(train_gen.class_indices)
print(validation_gen.class_indices)
Output:
{'Aloo Puri': 0, 'Bhindi Masala': 1, 'Chhole Bhature': 2, 'Dal Bati Churma': 3, 'Dal Makhni': 4, 'Dhokla': 5, 'Gulab Jamun': 6, 'Idli Sambhar': 7, 'Jalebi': 8, 'Kheer': 9, 'Mushroom': 10, 'Paneer': 11, 'Pav Bhaji': 12, 'Poha': 13, 'Rajma Chawal': 14, 'Rasgulla': 15, 'Rasmalai': 16, 'Sarson ka Saag Makki ki Roti': 17, 'Thepla': 18, 'Vada Pav': 19}
{'Aloo Puri': 0, 'Bhindi Masala': 1, 'Chhole Bhature': 2, 'Dal Bati Churma': 3, 'Dal Makhni': 4, 'Dhokla': 5, 'Gulab Jamun': 6, 'Idli Sambhar': 7, 'Jalebi': 8, 'Kheer': 9, 'Mushroom': 10, 'Paneer': 11, 'Pav Bhaji': 12, 'Poha': 13, 'Rajma Chawal': 14, 'Rasgulla': 15, 'Rasmalai': 16, 'Sarson ka Saag Makki ki Roti': 17, 'Thepla': 18, 'Vada Pav': 19}
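As a quick sanity check on these generator settings, the number of batches per epoch is the image count divided by the batch size, rounded up. The sketch below uses the 1,144 validation images that appear in the classification report in Step 9 and the 85 steps per epoch shown in the training log in Step 5. steps_per_epoch is a hypothetical helper, not a Keras function.

```python
import math

def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

batch_size = 32

# 1,144 validation images at 32 images per batch
val_steps = steps_per_epoch(1144, batch_size)
print(val_steps)

# The training log shows 85 steps per epoch, which bounds the training-set size
train_steps = 85
min_train = (train_steps - 1) * batch_size + 1
max_train = train_steps * batch_size
print(min_train, max_train)
```

The bounds on the training-set size are consistent with roughly 70% of the images going to training and 30% to validation.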
Step 4: Model Creation with Transfer Learning
In this step, a pre-trained model (MobileNetV2) is used as the base model. The layers of this model are frozen initially to retain the learned features. A custom classification head is added to the model for the specific task of classifying 20 different classes of food.
Python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2
reference_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freezing the layers
for layer in reference_model.layers:
    layer.trainable = False
model = models.Sequential([
    reference_model,
    layers.Flatten(),
    layers.BatchNormalization(),
    layers.Dense(units=128, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.2),
    layers.Dense(units=20, activation='softmax')
])
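To get a feel for the size of the custom head, its parameter count can be worked out by hand. This is a rough sketch, assuming that MobileNetV2 without its top layer produces a 7x7x1280 feature map for a 224x224x3 input. dense_params and batchnorm_params are hypothetical helpers, and model.summary() remains the authoritative source for the real numbers.

```python
# Assumption: for a 224x224x3 input, MobileNetV2 without its top
# outputs a 7x7x1280 feature map.
feature_map = (7, 7, 1280)
flat_features = feature_map[0] * feature_map[1] * feature_map[2]

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

def batchnorm_params(features):
    """gamma, beta, moving mean, moving variance: four values per feature."""
    return 4 * features

head = {
    "batch_norm_1": batchnorm_params(flat_features),
    "dense_128": dense_params(flat_features, 128),
    "batch_norm_2": batchnorm_params(128),
    "dense_20": dense_params(128, 20),
}
for name, count in head.items():
    print(f"{name}: {count:,}")
print(f"total head parameters: {sum(head.values()):,}")
```

Most of the head's parameters sit in the first Dense layer because Flatten keeps all 62,720 features. Replacing Flatten with GlobalAveragePooling2D would reduce that input to 1,280 values, which is a common way to shrink the head.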
Step 5: Model Compilation and Training
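The model is compiled with the Adam optimizer (learning rate 0.001), sparse categorical cross-entropy as the loss (the generators yield integer class labels), and accuracy as the evaluation metric. A ModelCheckpoint callback saves the model whenever validation accuracy improves. An EarlyStopping callback is also defined to guard against overfitting, but in the code below it is not passed to fit, so all 10 epochs run; add it to the callbacks list to enable it. The model is then trained on the training data and validated on the validation data.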
Python
import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint
# Defined here but not passed to fit() below; add it to `callbacks` to enable it
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# Save a checkpoint at the end of each epoch in which val_accuracy improves
checkpoint_callback = ModelCheckpoint(
    filepath='/content/drive/MyDrive/saved_model/model_epoch_{epoch:02d}.h5',
    monitor='val_accuracy',
    save_best_only=True,
    save_freq='epoch',
)
trained_model = model.fit(
    train_gen,
    epochs=10,
    validation_data=validation_gen,
    callbacks=[checkpoint_callback]
)
Output:
Epoch 1/10
30/85 [=========>....................] - ETA: 30s - loss: 1.6956 - accuracy: 0.5208
/usr/local/lib/python3.10/dist-packages/PIL/Image.py:996: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
85/85 [==============================] - 83s 936ms/step - loss: 1.2460 - accuracy: 0.6422 - val_loss: 1.0757 - val_accuracy: 0.7273
Epoch 2/10
85/85 [==============================] - 69s 816ms/step - loss: 0.1729 - accuracy: 0.9774 - val_loss: 0.8590 - val_accuracy: 0.7552
Epoch 3/10
85/85 [==============================] - 69s 820ms/step - loss: 0.0577 - accuracy: 0.9978 - val_loss: 0.8410 - val_accuracy: 0.7552
Epoch 4/10
85/85 [==============================] - 70s 822ms/step - loss: 0.0266 - accuracy: 0.9993 - val_loss: 0.8282 - val_accuracy: 0.7657
Epoch 5/10
85/85 [==============================] - 68s 806ms/step - loss: 0.0185 - accuracy: 0.9996 - val_loss: 0.8367 - val_accuracy: 0.7649
Epoch 6/10
85/85 [==============================] - 70s 825ms/step - loss: 0.0109 - accuracy: 0.9996 - val_loss: 0.8370 - val_accuracy: 0.7649
Epoch 7/10
85/85 [==============================] - 70s 827ms/step - loss: 0.0106 - accuracy: 0.9996 - val_loss: 0.8666 - val_accuracy: 0.7605
Epoch 8/10
85/85 [==============================] - 70s 828ms/step - loss: 0.0095 - accuracy: 0.9996 - val_loss: 0.8606 - val_accuracy: 0.7561
Epoch 9/10
85/85 [==============================] - 68s 800ms/step - loss: 0.0052 - accuracy: 1.0000 - val_loss: 0.8555 - val_accuracy: 0.7570
Epoch 10/10
85/85 [==============================] - 68s 808ms/step - loss: 0.0072 - accuracy: 1.0000 - val_loss: 0.8684 - val_accuracy: 0.7579
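To see what EarlyStopping(monitor='val_loss', patience=5) would do with the validation losses logged above, the patience rule can be replayed by hand. The sketch below is a simplified re-implementation of that rule (it assumes the default min_delta of 0 and is not Keras's own code). simulate_early_stopping is a hypothetical helper.

```python
# val_loss per epoch, copied from the training log above
val_losses = [1.0757, 0.8590, 0.8410, 0.8282, 0.8367,
              0.8370, 0.8666, 0.8606, 0.8555, 0.8684]

def simulate_early_stopping(losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.

    stop_epoch is None if training would run to the end.
    """
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            # Improvement: remember it and reset the patience counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

stop_epoch, best_epoch = simulate_early_stopping(val_losses, patience=5)
print(stop_epoch, best_epoch)
```

With these numbers the rule would halt training after epoch 9 and, with restore_best_weights=True, roll the model back to its epoch 4 weights, where val_loss reached its minimum of 0.8282.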
Step 6: Fine-tuning the Model
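After initial training, all layers of the base model are unfrozen and the model is recompiled with the SGD optimizer (learning rate 0.001, momentum 0.9). Updating the pre-trained weights with small steps, a process called fine-tuning, can improve the model's performance. Recompiling alone does not change any weights, so model.fit must be called again for fine-tuning to take effect.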
Python
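# Unfreezing all layers of the base model
for layer in reference_model.layers:
    layer.trainable = True
# Recompile so the change in trainable layers takes effect
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# Call model.fit(...) again after recompiling to actually fine-tune the weights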
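To make the optimizer settings concrete, here is a minimal sketch of the update that SGD with momentum applies to a single weight: the velocity accumulates a decaying sum of past gradients, and the weight moves by that velocity. The scalar weight, the constant gradient of 1.0, and the helper name sgd_momentum_step are illustrative assumptions; the real optimizer applies the same rule to every trainable tensor.

```python
def sgd_momentum_step(weight, velocity, grad, learning_rate=0.001, momentum=0.9):
    """One update: velocity = momentum * velocity - learning_rate * grad."""
    velocity = momentum * velocity - learning_rate * grad
    return weight + velocity, velocity

# Hypothetical scalar weight with a constant gradient of 1.0
weight, velocity = 0.5, 0.0
for step in range(3):
    weight, velocity = sgd_momentum_step(weight, velocity, grad=1.0)
    print(step + 1, round(weight, 6), round(velocity, 6))
```

Each step moves the weight slightly further than the last because the velocity keeps building while the gradient points the same way, which is why a small learning rate is usually paired with momentum when fine-tuning.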
Step 7: Saving the Final Model
After training, the final model is saved to Google Drive for later use.
Python
# model.save() returns None, so its result is not assigned
model.save('/content/drive/MyDrive/saved_model/final.keras')
Step 8: Loading the Model and Making Predictions
The saved model is loaded from Google Drive, and a function is defined to predict the class of a new image.
Python
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
import numpy as np
# Function to predict images
def predict_image(image_path, saved_model):
    # Apply the same preprocessing as in training: resize to 224x224, rescale to [0, 1]
    img = image.load_img(image_path, target_size=(224, 224))
    img = image.img_to_array(img)
    img = img / 255.0
    # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
    img = np.expand_dims(img, axis=0)
    prediction = saved_model.predict(img)
    # Class names in the same order as train_gen.class_indices
    classes = ['Aloo Puri', 'Bhindi Masala', 'Chhole Bhature', 'Dal Bati Churma', 'Dal Makhni', 'Dhokla', 'Gulab Jamun', 'Idli Sambhar', 'Jalebi', 'Kheer', 'Mushroom', 'Paneer', 'Pav Bhaji', 'Poha', 'Rajma Chawal', 'Rasgulla', 'Rasmalai', 'Sarson ka Saag Makki ki Roti', 'Thepla', 'Vada Pav']
    predicted_class = classes[np.argmax(prediction)]
    return predicted_class
saved_model = load_model('/content/drive/MyDrive/saved_model/final.keras')
image_path = '/content/drive/MyDrive/new.jfif'
predicted_class = predict_image(image_path, saved_model)
print(predicted_class)
Output:
Jalebi
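Hardcoding the class list works as long as it stays in the same order as the generator's class_indices mapping. One way to avoid keeping the two in sync by hand is to derive the list from that mapping, and the same list can be used to report the top few predictions instead of only the best one. The sketch below uses a shortened, renumbered mapping and a made-up probability vector purely for illustration. index_to_label and top_k are hypothetical helpers.

```python
def index_to_label(class_indices):
    """Invert {'name': index} into a list where position == index."""
    labels = [None] * len(class_indices)
    for name, idx in class_indices.items():
        labels[idx] = name
    return labels

def top_k(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Shortened, illustrative mapping and a made-up probability vector
class_indices = {'Dhokla': 0, 'Jalebi': 1, 'Kheer': 2, 'Poha': 3}
probabilities = [0.05, 0.80, 0.10, 0.05]

labels = index_to_label(class_indices)
print(top_k(probabilities, labels, k=2))
```

In the tutorial's setting, the mapping would be train_gen.class_indices and the probabilities would be the first row of saved_model.predict(img).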
Step 9: Model Evaluation
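The model is evaluated on the validation dataset, and a classification report is generated to see how well it performed on each class. Predictions are only comparable with the generator's classes attribute when the generator does not shuffle, so the code below builds a separate evaluation generator with shuffle=False. The report shown under Output comes from a run that predicted on the shuffled validation_gen, which is the likely reason its scores sit near chance (accuracy 0.07) even though validation accuracy during training was about 0.76.
Python
from sklearn.metrics import classification_report
# shuffle=False keeps prediction order aligned with eval_gen.classes
eval_gen = train_datagen.flow_from_directory(
    dataset_path, target_size=(224, 224), class_mode='sparse',
    batch_size=32, subset='validation', shuffle=False)
y_predict = model.predict(eval_gen)
y_true = eval_gen.classes
print(classification_report(y_true, y_predict.argmax(axis=1)))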
Output:
              precision    recall  f1-score   support
           0       0.03      0.02      0.03        46
           1       0.06      0.07      0.06        72
           2       0.07      0.08      0.07        79
           3       0.08      0.08      0.08        91
           4       0.08      0.07      0.07        46
           5       0.09      0.09      0.09        97
           6       0.04      0.04      0.04        51
           7       0.07      0.07      0.07        90
           8       0.02      0.03      0.02        77
           9       0.05      0.05      0.05        60
          10       0.03      0.02      0.02        63
          11       0.09      0.12      0.10       100
          12       0.00      0.00      0.00        12
          13       0.00      0.00      0.00         4
          14       0.09      0.09      0.09        91
          16       0.00      0.00      0.00         1
          17       0.06      0.06      0.06        90
          18       0.08      0.08      0.08        74
    accuracy                           0.07      1144
   macro avg       0.05      0.05      0.05      1144
weighted avg       0.06      0.07      0.06      1144
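Scores this close to chance are what one would expect when predictions and labels are compared in different orders, for example when predictions come from a shuffling generator while the labels are read from its classes attribute. The sketch below illustrates the effect on synthetic data only: a hypothetical perfect classifier over 20 balanced classes is scored once with aligned predictions and once with the same predictions in shuffled order.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of positions where prediction equals label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

random.seed(0)
num_classes, per_class = 20, 50
y_true = [c for c in range(num_classes) for _ in range(per_class)]

# A hypothetical perfect classifier: predictions equal the labels
aligned_pred = list(y_true)

# The same predictions, but returned in a shuffled order
shuffled_pred = list(y_true)
random.shuffle(shuffled_pred)

aligned_acc = accuracy(y_true, aligned_pred)
shuffled_acc = accuracy(y_true, shuffled_pred)
print(aligned_acc, round(shuffled_acc, 3))
```

Even a perfect model scores near 1/20 once the order is scrambled, so a report like the one above says little about the model itself until the ordering is fixed.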
Step 10: Plotting Training and Validation Loss/Accuracy
This step involves plotting the training and validation loss and accuracy over the epochs to visualize the model's performance.
Python
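import matplotlib.pyplot as plt
training_loss = trained_model.history['loss']
test_loss = trained_model.history['val_loss']
training_acc = trained_model.history['accuracy']
test_acc = trained_model.history['val_accuracy']
epochs = range(len(training_loss))
# Loss curves
plt.title('Loss vs Epoch')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.plot(epochs, training_loss, label='Training Loss')
plt.plot(epochs, test_loss, label='Validation Loss')
plt.legend()
plt.show()
# Accuracy curves
plt.title('Accuracy vs Epoch')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.plot(epochs, training_acc, label='Training Accuracy')
plt.plot(epochs, test_acc, label='Validation Accuracy')
plt.legend()
plt.show()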
Output: [Figure: training and validation curves plotted against epoch]
Why MobileNetV2 Stood Out
Among the three architectures tested (VGG16, ResNet50, and MobileNetV2), MobileNetV2 emerged as the best-performing model; only the MobileNetV2 run is shown in this walkthrough. Its efficient architecture, designed for resource-constrained environments, keeps computational cost low while reaching about 76% validation accuracy in the training log above. This makes MobileNetV2 a good choice not only for large-scale classification tasks but also for deployment on devices with limited processing power.
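One rough way to put the lower computational cost in numbers is to compare parameter counts. The figures below are the approximate totals listed in the Keras Applications table for the full ImageNet models (including their original classification heads), so they are indicative only and are not measurements from this project.

```python
# Approximate parameter counts (millions) as listed in the Keras Applications table
params_millions = {"VGG16": 138.4, "ResNet50": 25.6, "MobileNetV2": 3.5}

smallest = min(params_millions, key=params_millions.get)
ratios = {}
for name, count in params_millions.items():
    ratios[name] = count / params_millions[smallest]
    print(f"{name}: {count}M parameters, {ratios[name]:.1f}x {smallest}")
```

Parameter count is only a proxy for cost; inference time and memory use on the target device are the more direct measures.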
Conclusion
The journey of building an Indian cuisine classification model is as much about understanding the intricacies of deep learning as it is about appreciating the diversity of Indian food. By employing advanced techniques like transfer learning, fine-tuning, and hyperparameter optimization, and by comparing different architectures, we were able to create a robust model capable of classifying a wide range of Indian dishes, with MobileNetV2 leading the way in performance.