Manipulating Pandas DataFrame using Mitosheet
Last Updated :
09 Jan, 2023
In this article, we will put some light on the application named Mitosheet and how we can use it to manipulate Pandas Dataframe easily.
PrerequisiteÂ
The user must be familiar with Pandas in Python.
What are Mito and Mitosheet?
Mito is a Data Analysis tool mainly used by data scientists to speed up data analysis using Pandas faster than the traditional way. It is developed by Jacob Diamond Reivich, it's the interface is just like Excel Spreadsheets but it provides a lot more additional benefits, especially for Data Scientists.
Mitosheet is the interface provided by Mito using which we can Manipulate Pandas Dataframe easily.
Installing the required modules -
To be able to use Mitosheet we either need to have Jupyter Notebook or JupyterLab (Google Colab doesn't support this). User can also install it and use it from the windows command prompt.
To install it follow the below steps -
Step 1: Write the following command in the Command Prompt terminal.
pip install mitoinstaller
Step 2: After it has finished then write the below command to install and start the Mitosheet-enabled Jupyterlab. Before executing the following command it is recommended to change the location from cmd to the directory where the dataset i.e the csv file resides.
python -m mitoinstaller install
This might take some time (depending on your internet speed and device configuration). After its completion, it will automatically open a Jupyterlab page in the default browser and it will look like the following.
A button like Create New Mitosheet will be visible.
Click on Create New Mitosheet or run the pre-created cell, if that doesn't work or generate Mitosheet then click on the button. The following code will be generated automatically and this output will be shown for the first time. Provide your email and any name for the Organization it doesn't matter.
After filling in the details click Sign Up.
The following empty Mitosheet interface opens -
Now using Pandas we will import a dataset that will be visible in the Mitosheets.
Importing a dataset using Mitosheet
Now after signing up we will import a CSV file using Mitosheet.
 After clicking Create New Mitosheet you will see in the Mitosheet interface there is a button called Import.
Click on that and  If you have changed the directory location earlier before running the python -m mitoinstaller install command then Import will open that directory where the data resides, otherwise, you need to change the directory manually from the Mitosheet import interface and open the file location and then double click on it.
When you import it you will see there will be an auto-generated pandas code in the below cell. This happens as the changes we will make in the Mitosheet the respective code will be auto-generated in the below cells.
This code auto-generated in the below cell after importing the dataset.
Mito Interface after importing and loading a dataset.
Making changes in the Mitosheet
Now the changes we will make directly in the Mitosheet will be reflected in the below cell, the equivalent pandas code will be generated automatically.
Changing the Column name -
We will change the column name of the first column from Unnamed : 0 to Index directly in the Mitosheet and that equivalent code will be auto-generated in its following cell.
First, select the column whose header we want to change, then double click in the marked area and type the new name and press enter. It's equivalent Python code will be generated in the following cell.
before changing
 After changing the header and the auto-generated code -
Changed header
Autogenerated codeChanging the datatype of a Column -
We can also change the data type of a certain column and that code will be auto-generated too.
Click on the Column whose datatype you want to change, then click on the data type, a new dialog box like below will open -
Now change the Dtype to anything new (select from the dropdown menu). If you want you can also change the NumType from drop down menu to Plain Text, Currency, Percentage etc. Even we can Sort the values in Ascending or Descending order.
All the changes made have been generated, changed the datatype to float, then sorted in Descending order and as it changed to float it has been formatted to 2 decimal places.
Adding a Filter to a Column -
We can add any Filter or a Group of Filter to a Certain Column -Â
Click on the Funnel Looking icon of any column in which we want to add filter.
Then add any type of filter of user's choice whose code will be generated.
Adding a Filter where only the values greater than 600 will be shown. This not only affects that Column but also the Entire Table / Dataset.
Auto-generated code.
Plotting a Graph
We can easily plot a graph using this.
Click on the graph icon and this interface opens. Here we will choose the Dataframe which we want to plot (This is helpful when we have more than one data frame).
Available types of Charts can be plotted.
We can also change the style of the plots minutely and also export it as png or copy the code which produces the Graph.

Other Features that are available -
Mitosheet consists of all the features that are possible using pandas Dataframe. We can also add a new column to the existing dataframe , Delete any column, export that into either CSV or excel file, make a Pivot table, merge more than one dataframe, and so on.
It doesn't consist of any new features other than the Pandas one, it is just a new GUI that supports all the Pandas functions so those who are not that familiar with Python or Pandas can easily do Data Analysis without the need to write any less to no code.
All the generated codesÂ
from mitosheet import *; register_analysis("id-ybvoydxqbd");
# Imported dataset.csv
import pandas as pd
dataset = pd.read_csv(r'E:\Geeks_Articles\dataset.csv')
# Renamed columns index
dataset.rename(columns={'Unnamed: 0': 'index'}, inplace=True)
# Changed Country to dtype float
dataset['Country'] = dataset['Country'].astype('float')
# Sorted Country in descending order
dataset = dataset.sort_values(by='Country', ascending=False, na_position='last')
# Filtered Country
dataset = dataset[dataset['Country'] > 600]
# Formatted dataframes. View these styling objects to see the formatted dataframe
dataset_styler = dataset.style\
.format("${:.2f}", subset=['Country'])
Similar Reads
String manipulations in Pandas DataFrame
String manipulation is the process of changing, parsing, splicing, pasting or analyzing strings. As we know that sometimes data in the string is not suitable for manipulating the analysis or get a description of the data. But Python is known for its ability to manipulate strings. In this article we
4 min read
Manipulating DataFrames with Pandas - Python
Before manipulating the dataframe with pandas we have to understand what is data manipulation. The data in the real world is very unpleasant & unordered so by performing certain operations we can make data understandable based on one's requirements, this process of converting unordered data into
4 min read
Memory leak using Pandas DataFrame
Pandas is a powerful and widely-used open-source data analysis and manipulation library for Python. It provides a DataFrame object that allows you to store and manipulate tabular data in rows and columns in a very intuitive way. Pandas DataFrames are powerful tools for working with data, but they ca
9 min read
Data Manipulation in Python using Pandas
In Machine Learning, the model requires a dataset to operate, i.e. to train and test. But data doesnât come fully prepared and ready to use. There are discrepancies like Nan/ Null / NA values in many rows and columns. Sometimes the data set also contains some of the rows and columns which are not ev
6 min read
R - DataFrame Manipulation
Data Frame is a two-dimensional structured entity consisting of rows and columns. It consists equal length vectors as rows. The data is stored in cells which are accessed by specifying the corresponding [row, col] set of values of the data frame. Manipulation of data frames involve modifying, extrac
9 min read
How to Join Pandas DataFrames using Merge?
Joining and merging DataFrames is that the core process to start  out with data analysis and machine learning tasks. It's one of the toolkits which each Data Analyst or Data Scientist should master because in most cases data comes from multiple sources and files. In this tutorial, you'll how to join
3 min read
Creating Pandas dataframe using list of lists
In this article, we will explore the Creating Pandas data frame using a list of lists. A Pandas DataFrame is a versatile 2-dimensional labeled data structure with columns that can contain different data types. It is widely utilized as one of the most common objects in the Pandas library. There are v
4 min read
DataFrame.to_excel() method in Pandas
The to_excel() method is used to export the DataFrame to the excel file. Â To write a single object to the excel file, we have to specify the target file name. If we want to write to multiple sheets, we need to create an ExcelWriter object with target filename and also need to specify the sheet in th
3 min read
Pandas DataFrame.loc[] Method
Pandas DataFrame is a two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. It can be thought of as a dict-like container for Series objects. This is the primary data structure o
6 min read
Python | Pandas dataframe.mask()
Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric python packages. Pandas is one of those packages and makes importing and analyzing data much easier. Pandas dataframe.mask() function return an object of same shape as self and whose corr
3 min read