Zomato Data Analysis Using Python
Last Updated :
16 May, 2025
Understanding customer preferences and restaurant trends is important for making informed business decisions in the food industry. In this article, we will analyze Zomato’s restaurant dataset using Python to find meaningful insights. We aim to answer questions such as:
- Do more restaurants provide online delivery compared to offline services?
- Which types of restaurants are most favored by the general public?
- What price range do couples prefer for dining out?
Implementation of Zomato Data Analysis using Python
The implementation follows the steps below.
Step 1: Importing necessary Python libraries.
We will use the Pandas, NumPy, Matplotlib and Seaborn libraries.
Python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Step 2: Creating the data frame.
You can download the dataset from here.
Python
dataframe = pd.read_csv("/content/Zomato-data-.csv")
print(dataframe.head())
Output:
Dataset
Step 3: Data Cleaning and Preparation
Before moving further we need to clean and process the data.
1. Convert the rate column to a float by stripping the '/' and the denominator that follows it.
- dataframe['rate']=dataframe['rate'].apply(handleRate): Applies the handleRate function to clean and convert each rating value in the 'rate' column.
Python
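def handleRate(value):
    # Keep only the numerator, i.e. the text before the '/'
    value = str(value).split('/')
    value = value[0]
    return float(value)

# Apply the conversion to every entry in the 'rate' column
dataframe['rate'] = dataframe['rate'].apply(handleRate)
print(dataframe.head())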
Output:
Converting rate column to float
2. Get a summary of the dataframe using dataframe.info().
Python
dataframe.info()
Output:
Summary of dataset
3. Check for missing or null values to identify any data gaps.
Conclusion: There are no NULL values in the dataframe.
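The cleaning and null-check steps above can be sketched together on a small, hypothetical sample. The sample values (including the "NEW" placeholder and the missing rating) are assumptions made for illustration and are not taken from the Zomato file. Unlike handleRate, this variant uses pd.to_numeric with errors="coerce", so entries that cannot be parsed become NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical sample; the real data is loaded from the CSV in Step 2.
sample = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "rate": ["4.1/5", "3.8/5", "NEW", None],
    "votes": [775, 787, 0, 12],
})

# Keep the text before '/', then coerce anything non-numeric to NaN.
sample["rate"] = pd.to_numeric(
    sample["rate"].astype(str).str.split("/").str[0].str.strip(),
    errors="coerce",
)

# Count the missing values in each column.
missing_per_column = sample.isnull().sum()
print(missing_per_column)
```

A column whose count is greater than zero has gaps that need to be filled or dropped before plotting.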
Step 4: Exploring Restaurant Types
1. Let's look at the listed_in(type) column to identify popular restaurant categories.
Python
sns.countplot(x=dataframe['listed_in(type)'])
plt.xlabel("Type of restaurant")
Output:

Conclusion: The majority of the restaurants fall into the dining category.
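To read the exact numbers behind a count plot, the same column can be tallied with value_counts. This is a minimal sketch on hypothetical category values; only the column name comes from the dataset.

```python
import pandas as pd

# Hypothetical sample standing in for the 'listed_in(type)' column.
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Dining", "Cafes", "Buffet", "Dining", "other"],
})

# Number of restaurants per category, largest first.
type_counts = sample["listed_in(type)"].value_counts()
print(type_counts)

# Label of the most frequent category.
most_common_type = type_counts.idxmax()
print(most_common_type)
```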
2. Votes by Restaurant Type
Here we compute the total number of votes for each category.
Python
grouped_data = dataframe.groupby('listed_in(type)')['votes'].sum()
result = pd.DataFrame({'votes': grouped_data})
plt.plot(result, c='green', marker='o')
plt.xlabel('Type of restaurant', c='red', size=20)
plt.ylabel('Votes', c='red', size=20)
Output:
Text(0, 0.5, 'Votes')

Conclusion: Dining restaurants are preferred by a larger number of individuals.
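A category with many restaurants will naturally collect more votes in total, so it can help to look at the average per restaurant as well. The sketch below uses hypothetical vote counts and the result column names (total_votes, restaurants, votes_per_restaurant) are made up for this example.

```python
import pandas as pd

# Hypothetical sample; column names match the dataset, values do not.
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Dining", "Cafes", "Buffet"],
    "votes": [100, 200, 150, 50],
})

# Total votes, restaurant count and mean votes for each category.
summary = (
    sample.groupby("listed_in(type)")["votes"]
    .agg(total_votes="sum", restaurants="count", votes_per_restaurant="mean")
    .sort_values("total_votes", ascending=False)
)
print(summary)
```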
Step 5: Identify the Most Voted Restaurant
Find the restaurant with the highest number of votes.
Python
max_votes = dataframe['votes'].max()
restaurant_with_max_votes = dataframe.loc[dataframe['votes'] == max_votes, 'name']
print('Restaurant(s) with the maximum votes:')
print(restaurant_with_max_votes)
Output:
Highest number of votes
Step 6: Online Order Availability
Exploring the online_order column to see how many restaurants accept online orders.
Python
sns.countplot(x=dataframe['online_order'])
Output:

Conclusion: This suggests that a majority of the restaurants do not accept online orders.
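The share of restaurants in each group can also be expressed as a proportion. This is a small sketch with hypothetical "Yes"/"No" values, which are assumed to be how the online_order column is encoded.

```python
import pandas as pd

# Hypothetical sample; "Yes"/"No" labels are an assumption about the column.
sample = pd.DataFrame({"online_order": ["Yes", "No", "No", "Yes", "No"]})

# Fraction of restaurants in each group (the values sum to 1).
shares = sample["online_order"].value_counts(normalize=True)
print(shares)
```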
Step 7: Analyze Ratings
Checking the distribution of ratings from the rate column.
Python
plt.hist(dataframe['rate'],bins=5)
plt.title('Ratings Distribution')
plt.show()
Output:

Conclusion: The majority of restaurants received ratings ranging from 3.5 to 4.
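A histogram with five automatic bins can make the bin edges hard to read. One alternative, sketched here on hypothetical ratings, is to choose the edges explicitly with pd.cut and count the ratings that fall into each interval. The bin edges below are an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical ratings, already converted to floats as in Step 3.
sample = pd.DataFrame({"rate": [2.9, 3.3, 3.6, 3.7, 3.8, 3.9, 4.1, 4.4]})

# Explicit, half-point-wide bins; each interval is closed on the right.
bins = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
binned = pd.cut(sample["rate"], bins=bins)

# Number of ratings per interval, in bin order.
counts = binned.value_counts().sort_index()
print(counts)

# Interval that holds the most ratings.
busiest_bin = counts.idxmax()
print(busiest_bin)
```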
Step 8: Approximate Cost for Couples
Analyze the approx_cost(for two people) column to find the preferred price range.
Python
couple_data=dataframe['approx_cost(for two people)']
sns.countplot(x=couple_data)
Output:

Conclusion: The most common approximate cost for two people is 300 rupees, which suggests that most couples prefer restaurants in this price range.
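The tallest bar in the count plot corresponds to the mode of the column, which can be computed directly. The costs below are hypothetical and are assumed to be stored as numbers.

```python
import pandas as pd

# Hypothetical costs for two people, assumed to be numeric.
sample = pd.DataFrame(
    {"approx_cost(for two people)": [300, 800, 300, 600, 300, 800]}
)

# How often each cost value appears, most frequent first.
cost_counts = sample["approx_cost(for two people)"].value_counts()
print(cost_counts)

# The most frequent cost (the first mode if there are ties).
most_common_cost = sample["approx_cost(for two people)"].mode()[0]
print(most_common_cost)
```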
Step 9: Ratings Comparison - Online vs Offline Orders
Compare ratings between restaurants that accept online orders and those that don't.
Python
plt.figure(figsize = (6,6))
sns.boxplot(x = 'online_order', y = 'rate', data = dataframe)
Output:

Conclusion: Restaurants that do not accept online orders received lower ratings than restaurants that do, which obtained excellent ratings.
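The comparison that the box plot shows visually can be backed by summary statistics for each group. This is a minimal sketch with hypothetical ratings; the median corresponds to the line inside each box.

```python
import pandas as pd

# Hypothetical sample; ratings are assumed to be floats already.
sample = pd.DataFrame({
    "online_order": ["Yes", "Yes", "Yes", "No", "No", "No"],
    "rate": [4.0, 4.2, 3.9, 3.4, 3.6, 3.5],
})

# Median, mean and number of restaurants for each group.
rating_summary = sample.groupby("online_order")["rate"].agg(
    ["median", "mean", "count"]
)
print(rating_summary)
```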
Step 10: Order Mode Preferences by Restaurant Type
Find the relationship between order mode (online_order) and restaurant type (listed_in(type)).
- pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0): Creates a pivot table counting restaurants by type and online order availability.
Python
pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0)
sns.heatmap(pivot_table, annot=True, cmap='YlGnBu', fmt='d')
plt.title('Heatmap')
plt.xlabel('Online Order')
plt.ylabel('Listed In (Type)')
plt.show()
Output:

Conclusion: Dining restaurants primarily accept offline orders whereas cafes primarily receive online orders. This suggests that clients prefer to place orders in person at restaurants but prefer online ordering at cafes.
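Raw counts in a heatmap are dominated by the largest category, so row-wise proportions can make the types easier to compare. The sketch below uses pd.crosstab with normalize="index" on hypothetical values; it is an alternative view, not the pivot table used above.

```python
import pandas as pd

# Hypothetical sample; only the column names come from the dataset.
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Dining", "Dining", "Dining",
                        "Cafes", "Cafes", "Cafes"],
    "online_order": ["No", "No", "No", "Yes", "Yes", "Yes", "No"],
})

# Share of online/offline restaurants within each type (rows sum to 1).
order_shares = pd.crosstab(
    sample["listed_in(type)"], sample["online_order"], normalize="index"
)
print(order_shares)
```

The resulting table can be passed to sns.heatmap in the same way as the pivot table, using fmt='.2f' instead of fmt='d' because the values are fractions.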
You can download the source code from here: Zomato Data Analysis