Zomato Data Analysis Using Python
Last Updated: 28 Jul, 2025
Understanding customer preferences and restaurant trends is important for making informed business decisions in the food industry. In this article, we will analyze Zomato’s restaurant dataset using Python to find meaningful insights. We aim to answer questions such as:
- Do more restaurants provide online delivery compared to offline services?
- Which types of restaurants are most favored by the general public?
- What price range do couples prefer for dining out?
Implementation of Zomato Data Analysis Using Python
The implementation follows the steps below.
Step 1: Importing necessary Python libraries.
We will use the pandas, NumPy, Matplotlib and Seaborn libraries.
Python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Step 2: Creating the data frame.
You can download the dataset from here.
Python
dataframe = pd.read_csv("/content/Zomato-data-.csv")
print(dataframe.head())
Output:
 Dataset
Step 3: Data Cleaning and Preparation
Before moving further we need to clean and process the data.
1. Convert the rate column to a float by removing the denominator, so that a value such as "4.1/5" becomes 4.1.
- dataframe['rate'] = dataframe['rate'].apply(handleRate): applies the handleRate function to clean and convert each rating value in the 'rate' column.
Python
def handleRate(value):
    # Keep only the part before '/' and convert it to a float
    value = str(value).split('/')
    value = value[0]
    return float(value)

dataframe['rate'] = dataframe['rate'].apply(handleRate)
print(dataframe.head())
Output:
 Converting rate column to float
2. Get a summary of the DataFrame using dataframe.info().
Python
dataframe.info()
Output:
 Summary of dataset
3. Check for missing or null values to identify any data gaps.
Python
print(dataframe.isnull().sum())
Output:
 null values
There are no null values in the DataFrame.
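The handleRate conversion above assumes every entry in the rate column starts with a number. As a minimal sketch, assuming some entries could instead be placeholders (the values "NEW" and "-" below are hypothetical and not confirmed to appear in this dataset), the same split-on-'/' idea can be written with pd.to_numeric so that unparseable entries become NaN instead of raising an error. The helper name parse_rate is introduced here for illustration only.

```python
import pandas as pd

# Hypothetical sample values; "NEW", "-" and None are assumed placeholders,
# not entries confirmed to exist in the article's dataset.
sample = pd.DataFrame({"rate": ["4.1/5", "3.8/5", "NEW", "-", None]})

def parse_rate(series: pd.Series) -> pd.Series:
    """Keep the text before '/' and convert it to float; bad entries become NaN."""
    numerator = series.astype(str).str.split("/").str[0].str.strip()
    return pd.to_numeric(numerator, errors="coerce")

sample["rate"] = parse_rate(sample["rate"])

# The null check from this step now also reports the unparseable ratings.
print(sample["rate"].tolist())
print("Missing ratings:", sample["rate"].isnull().sum())
```

With this variant, running isnull().sum() after the conversion shows how many ratings could not be parsed, which can then be dropped or filled as the analysis requires.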
Step 4: Exploring Restaurant Types
1. Let's look at the listed_in(type) column to identify popular restaurant categories.
Python
sns.countplot(x=dataframe['listed_in(type)'])
plt.xlabel("Type of restaurant")
plt.show()
Output:

Conclusion: The majority of the restaurants fall into the dining category.
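The count plot shows the ranking visually. To read off exact numbers, the same counts can be obtained with value_counts. The sketch below uses a small hypothetical DataFrame (the rows are made up for illustration) with the same column name as the dataset.

```python
import pandas as pd

# Hypothetical rows for illustration only; real counts come from the Zomato CSV.
toy = pd.DataFrame(
    {"listed_in(type)": ["Dining", "Cafes", "Dining", "Buffet", "Dining", "other"]}
)

# Number of restaurants per category, sorted from most to least common
type_counts = toy["listed_in(type)"].value_counts()
print(type_counts)
print("Most common type:", type_counts.idxmax())
```

On the real data, replacing toy with dataframe gives the numbers behind each bar.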
2. Votes by Restaurant Type
Here we get the total number of votes for each category.
Python
grouped_data = dataframe.groupby('listed_in(type)')['votes'].sum()
result = pd.DataFrame({'votes': grouped_data})
plt.plot(result, c='green', marker='o')
plt.xlabel('Type of restaurant')
plt.ylabel('Votes')
plt.show()
Output:
 
Conclusion: Dining restaurants receive the most votes, which suggests they are preferred by a larger number of people.
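To compare categories by total votes without reading values off a chart, the grouped sums can be sorted. This is a minimal sketch on hypothetical data; the vote numbers below are invented for illustration.

```python
import pandas as pd

# Hypothetical rows; vote counts are invented for illustration.
toy = pd.DataFrame({
    "listed_in(type)": ["Dining", "Cafes", "Dining", "Buffet"],
    "votes": [120, 300, 450, 80],
})

# Total votes per restaurant type, largest first
votes_by_type = (
    toy.groupby("listed_in(type)")["votes"]
    .sum()
    .sort_values(ascending=False)
)
print(votes_by_type)
```

Because restaurant types are unordered categories, a bar chart such as votes_by_type.plot(kind="bar") may be easier to read than a line plot, though that is a presentation choice rather than a requirement.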
Step 5: Identify the Most Voted Restaurant
Find the restaurant with the highest number of votes.
Python
max_votes = dataframe['votes'].max()
restaurant_with_max_votes = dataframe.loc[dataframe['votes'] == max_votes, 'name']
print('Restaurant(s) with the maximum votes:')
print(restaurant_with_max_votes)
Output:
 Highest number of votes
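An alternative way to find the top-voted restaurants is DataFrame.nlargest, which also makes it easy to list more than one restaurant. The sketch below uses hypothetical names and vote counts. With keep="all", ties for the last place are all retained, which matches the behaviour of the max-and-filter approach above.

```python
import pandas as pd

# Hypothetical restaurants and vote counts, for illustration only.
toy = pd.DataFrame({
    "name": ["A Cafe", "B Diner", "C Grill", "D Bistro"],
    "votes": [150, 900, 900, 40],
})

# Top restaurant(s) by votes; keep="all" returns every row tied for first place
top = toy.nlargest(1, "votes", keep="all")[["name", "votes"]]
print(top)

# Top three restaurants by votes
top_three = toy.nlargest(3, "votes")[["name", "votes"]]
print(top_three)
```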
Step 6: Online Order Availability
Exploring the online_order column to see how many restaurants accept online orders.
Python
sns.countplot(x=dataframe['online_order'])
plt.show()
Output:

Conclusion: The majority of the restaurants do not accept online orders.
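The size of that majority can be expressed as a percentage with value_counts(normalize=True). The sketch below runs on a hypothetical column of Yes/No values, so the percentages it prints are not those of the real dataset.

```python
import pandas as pd

# Hypothetical Yes/No values, for illustration only.
toy = pd.DataFrame({"online_order": ["Yes", "No", "No", "Yes", "No"]})

# Fraction of restaurants in each group (sums to 1.0)
share = toy["online_order"].value_counts(normalize=True)
print((share * 100).round(1))
```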
Step 7: Analyze Ratings
Checking the distribution of ratings from the rate column.
Python
plt.hist(dataframe['rate'], bins=5)
plt.title('Ratings Distribution')
plt.show()
Output:

Conclusion: The majority of restaurants received ratings ranging from 3.5 to 4.
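A histogram with automatically chosen bins can make it hard to state exact ranges. As a sketch, assuming half-point bin edges are acceptable for this analysis, pd.cut can count ratings in explicit intervals. The ratings below are hypothetical.

```python
import pandas as pd

# Hypothetical ratings, for illustration only.
ratings = pd.Series([2.9, 3.2, 3.6, 3.7, 3.8, 3.9, 4.1, 4.4])

# Explicit bin edges (an assumption); intervals are closed on the right
bins = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
binned = pd.cut(ratings, bins=bins)

# Number of ratings per interval, in bin order
counts = binned.value_counts().sort_index()
print(counts)
print("Most populated range:", counts.idxmax())
```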
Step 8: Approximate Cost for Couples
Analyze the approx_cost(for two people) column to find the preferred price range.
Python
couple_data = dataframe['approx_cost(for two people)']
sns.countplot(x=couple_data)
plt.show()
Output:

Conclusion: The most common approximate cost for two people is 300 rupees, which suggests that couples tend to prefer restaurants in this price range.
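The most frequent cost can also be computed directly rather than read from the plot. This minimal sketch uses hypothetical cost values and assumes the column is already numeric.

```python
import pandas as pd

# Hypothetical costs in rupees; assumes the column is already numeric.
toy = pd.DataFrame({"approx_cost(for two people)": [300, 500, 300, 800, 300, 500]})
cost = toy["approx_cost(for two people)"]

# mode() returns all most-frequent values; take the first one
most_common_cost = cost.mode().iloc[0]
print("Most common cost:", most_common_cost)
print("Median cost:", cost.median())
```

Reporting the median next to the mode gives a sense of whether the most common price is also typical of the dataset as a whole.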
Step 9: Ratings Comparison - Online vs Offline Orders
Compare ratings between restaurants that accept online orders and those that don't.
Python
plt.figure(figsize=(6, 6))
sns.boxplot(x='online_order', y='rate', data=dataframe)
plt.show()
Output:

Conclusion: Restaurants that do not accept online orders received lower ratings than restaurants that do, which were rated noticeably higher.
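The comparison shown in the box plot can be backed with summary statistics per group. The sketch below uses hypothetical ratings, so its numbers only illustrate the technique.

```python
import pandas as pd

# Hypothetical ratings, for illustration only.
toy = pd.DataFrame({
    "online_order": ["Yes", "No", "Yes", "No", "Yes", "No"],
    "rate": [4.1, 3.4, 3.9, 3.6, 4.3, 3.5],
})

# Count, median and mean rating for each order mode
summary = toy.groupby("online_order")["rate"].agg(["count", "median", "mean"])
print(summary)
```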
Step 10: Order Mode Preferences by Restaurant Type
Find the relationship between order mode (online_order) and restaurant type (listed_in(type)).
- pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0): Creates a pivot table counting restaurants by type and online order availability.
Python
pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0)
sns.heatmap(pivot_table, annot=True, cmap='YlGnBu', fmt='d')
plt.title('Heatmap')
plt.xlabel('Online Order')
plt.ylabel('Listed In (Type)')
plt.show()
Output:

Conclusion: Dining restaurants mostly do not accept online orders, whereas cafes mostly do. This suggests that customers prefer to order in person at dining restaurants but prefer online ordering at cafes.
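Raw counts can be misleading when one restaurant type is much larger than the others. As a sketch on hypothetical data, pd.crosstab with normalize="index" converts each row to shares, so the online and offline split can be compared across types directly.

```python
import pandas as pd

# Hypothetical rows, for illustration only.
toy = pd.DataFrame({
    "listed_in(type)": ["Dining", "Dining", "Dining", "Cafes", "Cafes", "Cafes"],
    "online_order": ["No", "No", "Yes", "Yes", "Yes", "No"],
})

# Share of each order mode within every restaurant type (each row sums to 1.0)
shares = pd.crosstab(toy["listed_in(type)"], toy["online_order"], normalize="index")
print(shares.round(2))
```

The resulting table can be passed to sns.heatmap in the same way as the pivot table above, using fmt='.2f' instead of fmt='d' because the values are fractions.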
You can download the source code from here: Zomato Data Analysis