Zomato Data Analysis Using Python

Last Updated : 28 Jul, 2025

Understanding customer preferences and restaurant trends is important for making informed business decisions in the food industry. In this article, we will analyze Zomato’s restaurant dataset using Python to find meaningful insights. We aim to answer questions such as:

Do more restaurants provide online delivery compared to offline services?
Which types of restaurants are most favored by the general public?
What price range do couples prefer for dining out?

Implementation for Zomato Data Analysis using Python.

The implementation follows the steps below.

Step 1: Importing necessary Python libraries.

We will use the Pandas, NumPy, Matplotlib and Seaborn libraries.

Python

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Step 2: Creating the data frame.

You can download the dataset from here.

Python

dataframe = pd.read_csv("/content/Zomato-data-.csv")
print(dataframe.head())

Output:

Step 3: Data Cleaning and Preparation

Before moving further we need to clean and process the data.

1. Convert the rate column to a float by removing the denominator. The code below assumes ratings are stored as strings such as "4.1/5" and keeps only the part before the slash.

dataframe['rate'] = dataframe['rate'].apply(handleRate): Applies the handleRate function to clean and convert each rating value in the 'rate' column.

Python

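def handleRate(value):
    # Keep only the part before the slash, e.g. "4.1" from "4.1/5".
    value = str(value).split('/')[0]
    return float(value)

# Replace the string ratings with their float equivalents.
dataframe['rate'] = dataframe['rate'].apply(handleRate)
print(dataframe.head())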

Output:

First rows of the dataframe after converting the rate column to float
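Rating columns sometimes contain entries that are not numbers, and float() raises a ValueError on those. The sketch below is a more defensive variant of handleRate. The name parse_rate and the placeholder strings "NEW" and "-" are assumptions for illustration, not values confirmed to appear in this dataset.

```python
import math

def parse_rate(value):
    """Return the numeric part of a rating like '4.1/5', or NaN if it cannot be parsed."""
    text = str(value).split('/')[0].strip()
    try:
        return float(text)
    except ValueError:
        # Hypothetical placeholders such as "NEW" or "-" end up here.
        return math.nan

print(parse_rate("4.1/5"))
print(parse_rate("NEW"))
```

It can be applied in the same way as handleRate, for example dataframe['rate'].apply(parse_rate). Unparseable entries then show up as missing values in the null check.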

2. Get a summary of the dataframe using dataframe.info().

Python

dataframe.info()

Output:

3. Checking for missing or null values to identify any data gaps.

Python

print(dataframe.isnull().sum())

Output:

Null-value counts per column

There are no null values in the dataframe.
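When a column does contain gaps, the share of missing values is often easier to interpret than the raw count. A minimal sketch on a small made-up frame follows; the rows are hypothetical and are not taken from the Zomato file.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the real data comes from the CSV loaded in Step 2.
sample = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "rate": [4.1, np.nan, 3.8, 3.9],
    "votes": [775, 787, 918, 88],
})

missing_count = sample.isnull().sum()   # number of nulls per column
missing_share = sample.isnull().mean()  # fraction of nulls per column

print(missing_count)
print(missing_share)
```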

Step 4: Exploring Restaurant Types

1. Let's look at the listed_in(type) column to identify popular restaurant categories.

Python

sns.countplot(x=dataframe['listed_in(type)'])
plt.xlabel("Type of restaurant")

Output:

Count of restaurants by type

Conclusion: The majority of the restaurants fall into the dining category.
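The count plot can be backed by exact numbers. The sketch below uses value_counts on made-up rows; the category labels and counts are hypothetical stand-ins for the listed_in(type) column.

```python
import pandas as pd

# Hypothetical categories, for illustration only.
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Dining", "Cafes", "Buffet", "Dining", "Cafes"],
})

# value_counts returns the categories sorted from most to least frequent.
type_counts = sample["listed_in(type)"].value_counts()

print(type_counts)
print("Most common type:", type_counts.idxmax())
```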

2. Votes by Restaurant Type

Here we compute the total number of votes for each category.

Python

grouped_data = dataframe.groupby('listed_in(type)')['votes'].sum()
result = pd.DataFrame({'votes': grouped_data})
plt.plot(result, c='green', marker='o')
plt.xlabel('Type of restaurant')
plt.ylabel('Votes')

Output:

Conclusion: Dining restaurants received the most votes in total, which suggests they are preferred by a larger number of people.
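Total votes depend on how many restaurants each category contains, so it can help to look at the mean votes per restaurant alongside the sum. A minimal sketch on made-up rows follows; the numbers are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical rows, for illustration only.
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Cafes", "Dining", "Buffet", "Cafes"],
    "votes": [120, 300, 80, 50, 100],
})

# Total votes, number of restaurants, and mean votes per restaurant for each type.
votes_by_type = (
    sample.groupby("listed_in(type)")["votes"]
    .agg(["sum", "count", "mean"])
    .sort_values("sum", ascending=False)
)

print(votes_by_type)
```

If a category leads on the sum but not on the mean, its lead may come mostly from having more restaurants listed.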

Step 5: Identify the Most Voted Restaurant

Find the restaurant with the highest number of votes.

Python

max_votes = dataframe['votes'].max()
restaurant_with_max_votes = dataframe.loc[dataframe['votes'] == max_votes, 'name']

print('Restaurant(s) with the maximum votes:')
print(restaurant_with_max_votes)

Output:
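To see the runner-up restaurants as well, nlargest is one option. The sketch below uses made-up names and vote counts, which are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical restaurants and vote counts.
sample = pd.DataFrame({
    "name": ["Alpha Cafe", "Beta Diner", "Gamma Grill"],
    "votes": [150, 4884, 918],
})

# Keep the two rows with the highest vote counts, largest first.
top = sample.nlargest(2, "votes")[["name", "votes"]]

print(top)
```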

Step 6: Online Order Availability

Exploring the online_order column to see how many restaurants accept online orders.

Python

sns.countplot(x=dataframe['online_order'])

Output:

Count of restaurants by online order availability

Conclusion: The count plot shows that a majority of the restaurants do not accept online orders.

Step 7: Analyze Ratings

Checking the distribution of ratings from the rate column.

Python

plt.hist(dataframe['rate'], bins=5)
plt.title('Ratings Distribution')
plt.show()

Output:

Ratings distribution

Conclusion: The majority of restaurants received ratings ranging from 3.5 to 4.
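With bins=5, plt.hist splits the observed range into five equal-width bins, so the bin edges are not necessarily round numbers. Counting ratings in explicit bins makes a statement such as "3.5 to 4" directly checkable. The ratings and bin edges below are hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical ratings, for illustration only.
sample = pd.DataFrame({"rate": [2.9, 3.4, 3.6, 3.7, 3.8, 3.9, 4.1, 4.4]})

# Explicit bin edges; each bin includes its right edge, e.g. (3.5, 4.0].
bins = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
binned = pd.cut(sample["rate"], bins=bins)

# Count the ratings in each bin, ordered from the lowest bin to the highest.
bin_counts = binned.value_counts().sort_index()

print(bin_counts)
print("Fullest bin:", bin_counts.idxmax())
```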

Step 8: Approximate Cost for Couples

Analyze the approx_cost(for two people) column to find the preferred price range.

Python

couple_data = dataframe['approx_cost(for two people)']
sns.countplot(x=couple_data)

Output:

Count of restaurants by approximate cost for two people

Conclusion: The most common approximate cost for two people is 300 rupees, which suggests that couples tend to prefer restaurants in this price range.

Step 9: Ratings Comparison - Online vs Offline Orders

Compare ratings between restaurants that accept online orders and those that don't.

Python

plt.figure(figsize=(6, 6))
sns.boxplot(x='online_order', y='rate', data=dataframe)

Output:

Box plot of ratings by online order availability

Conclusion: Restaurants that accept online orders received higher ratings than those that accept only offline orders.
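The box plot comparison can be summarized numerically by grouping ratings by order mode. A minimal sketch on made-up rows follows; the values are hypothetical and only illustrate the calculation.

```python
import pandas as pd

# Hypothetical ratings, for illustration only.
sample = pd.DataFrame({
    "online_order": ["Yes", "Yes", "Yes", "No", "No", "No"],
    "rate": [4.1, 3.9, 4.0, 3.6, 3.7, 3.2],
})

# The median is the center line of each box in a box plot.
rating_summary = sample.groupby("online_order")["rate"].agg(["count", "median", "mean"])

print(rating_summary)
```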

Step 10: Order Mode Preferences by Restaurant Type

Find the relationship between order mode (online_order) and restaurant type (listed_in(type)).

pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0): Creates a pivot table counting restaurants by type and online order availability.

Python

pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0)
sns.heatmap(pivot_table, annot=True, cmap='YlGnBu', fmt='d')
plt.title('Heatmap')
plt.xlabel('Online Order')
plt.ylabel('Listed In (Type)')
plt.show()

Output:

Heatmap of restaurant type versus online order availability

Conclusion: The heatmap shows that dining restaurants primarily accept offline orders, whereas cafes primarily receive online orders. This suggests that customers prefer to order in person at dining restaurants but prefer online ordering at cafes.
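Raw counts are hard to compare across categories of different sizes. One option is pd.crosstab with normalize="index", which turns each row into shares that sum to 1. The rows below are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical rows, for illustration only.
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Dining", "Dining", "Dining", "Cafes", "Cafes", "Cafes"],
    "online_order": ["No", "No", "No", "Yes", "Yes", "Yes", "No"],
})

# Share of each order mode within every restaurant type (each row sums to 1).
row_shares = pd.crosstab(
    sample["listed_in(type)"], sample["online_order"], normalize="index"
)

print(row_shares.round(2))
```

Because the result holds fractions rather than integer counts, passing it to sns.heatmap would need a float format such as fmt='.2f' instead of fmt='d'.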

You can download the source code from here: Zomato Data Analysis

9971jiyagarg
