How to use Hierarchical Indexes with Pandas?
Last Updated: 08 May, 2021
The index is like an address: it is how any data point in a DataFrame or Series is accessed. Both rows and columns have indexes. The row labels are called the index, and the column labels are simply the column names.
Hierarchical Indexes
Hierarchical indexing, also known as multi-indexing, means setting more than one column as the index. In this article, we use the homelessness.csv file.
Python3
# importing pandas library as alias pd
import pandas as pd
# calling the pandas read_csv() function.
# and storing the result in DataFrame df
df = pd.read_csv('homelessness.csv')
print(df.head())
Output:
At this point no column has been set as the index, so the DataFrame uses only the default integer index.
Columns in the DataFrame:
Python3
# using the pandas columns attribute.
col = df.columns
print(col)
Output:
Index(['Unnamed: 0', 'region', 'state', 'individuals', 'family_members',
'state_pop'],
dtype='object')
To make a column the index, we use the pandas set_index() method. To use a single column as the index, pass its name as a string to set_index(). For multi-indexing (hierarchical indexing), pass a list of column names to set_index().
The code below demonstrates hierarchical indexing in pandas:
Python3
# using the pandas set_index() function.
df_ind3 = df.set_index(['region', 'state', 'individuals'])
# sort_index() returns a new sorted DataFrame,
# so the result has to be assigned back.
df_ind3 = df_ind3.sort_index()
print(df_ind3.head(10))
Output:
The DataFrame now uses hierarchical indexing (multi-indexing).
Note that we have set three columns as the index ('region', 'state', 'individuals'). The first, 'region', is the level 0 index at the top of the hierarchy. The next, 'state', is the level 1 index beneath it, and so on. Because the index levels form a hierarchy, this is called hierarchical indexing.
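To make the level numbering concrete, here is a minimal sketch on a small made-up table. The DataFrame `toy` and its values are illustrative assumptions, not rows from homelessness.csv. It contrasts a single-column index with a multi-column one and inspects the resulting levels.

```python
import pandas as pd

# Small made-up table; the numbers are illustrative, not taken from homelessness.csv.
toy = pd.DataFrame({
    "region": ["Pacific", "Pacific", "Mountain", "Mountain"],
    "state": ["Alaska", "Hawaii", "Arizona", "Idaho"],
    "individuals": [10, 20, 30, 40],
    "state_pop": [100, 200, 300, 400],
})

# Passing a string gives a single-level index; passing a list gives a MultiIndex.
single = toy.set_index("region")
multi = toy.set_index(["region", "state"]).sort_index()

print(type(single.index).__name__)  # Index
print(type(multi.index).__name__)   # MultiIndex
print(multi.index.nlevels)          # 2
print(list(multi.index.names))      # ['region', 'state']

# Level 0 is the outermost level; level 1 sits beneath it.
print(list(multi.index.get_level_values(0).unique()))  # ['Mountain', 'Pacific']
```

The level numbers used in this article correspond to positions in `index.names`, so a level can be referred to either by number or by name.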
Sometimes we need to turn an index column back into a regular column. The pandas reset_index() method does this. Passing inplace=True modifies the DataFrame in place instead of returning a new one.
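As a rough illustration of reset_index(), the sketch below uses a made-up table (the name `toy` and its values are assumptions for the example). It shows moving all index levels back to columns, moving only one level, and discarding the index with drop=True.

```python
import pandas as pd

# Made-up data for illustration only.
toy = pd.DataFrame({
    "region": ["Pacific", "Pacific", "Mountain"],
    "state": ["Alaska", "Hawaii", "Idaho"],
    "individuals": [10, 20, 40],
})
indexed = toy.set_index(["region", "state"])

# reset_index() returns a new DataFrame with every index level moved back to a column.
flat = indexed.reset_index()
print(list(flat.columns))         # ['region', 'state', 'individuals']

# level= moves only the named level; the remaining levels stay in the index.
partial = indexed.reset_index(level="state")
print(list(partial.index.names))  # ['region']

# drop=True discards the index levels instead of keeping them as columns.
dropped = indexed.reset_index(drop=True)
print(list(dropped.columns))      # ['individuals']
```

Assigning the returned DataFrame, as done here, is usually easier to follow than inplace=True, though both leave the same result.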
Selecting Data Using a Hierarchical Index:
To select data from the DataFrame with the .loc indexer, we pass the index labels in a list.
Python3
# selecting the 'Pacific' and 'Mountain'
# region from the dataframe.
# selecting data using level(0) index or main index.
df_ind3_region = df_ind3.loc[['Pacific', 'Mountain']]
print(df_ind3_region.head(10))
Output:
We cannot pass only level 1 labels to .loc in this way. A plain list of labels is matched against level 0, so doing this raises a KeyError. With .loc and a list, inner-level labels must be combined with their outer-level labels in a list of tuples.
Python3
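# using only the inner index 'state' for getting data.
# This is expected to raise a KeyError, because the labels
# are looked up in level 0 ('region'), not in 'state'.
df_ind3_state = df_ind3.loc[['Alaska', 'California', 'Idaho']]
print(df_ind3_state.head(10))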
Output:
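If the goal is to select by an inner level alone, pandas offers other tools besides a plain list in .loc. The sketch below is not part of the original walkthrough. It uses a made-up table (`toy` is an assumed name) to show DataFrame.xs() with a level argument and pd.IndexSlice. Slicing this way generally needs a sorted index, so the example calls sort_index() first.

```python
import pandas as pd

# Made-up data for illustration only.
toy = pd.DataFrame({
    "region": ["Pacific", "Pacific", "Mountain", "Mountain"],
    "state": ["Alaska", "Hawaii", "Arizona", "Idaho"],
    "individuals": [10, 20, 30, 40],
})
multi = toy.set_index(["region", "state"]).sort_index()

# xs() takes a cross-section on a named level;
# drop_level=False keeps that level in the result's index.
alaska = multi.xs("Alaska", level="state", drop_level=False)
print(alaska)

# With IndexSlice, ":" means "any region", followed by a list of states.
idx = pd.IndexSlice
some_states = multi.loc[idx[:, ["Alaska", "Idaho"]], :]
print(some_states)
```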
Using inner-level indexes with a list of tuples:
Syntax:
df.loc[[ ( level( 0 ) , level( 1 ) , level( 2 ) ) ]]
Python3
# selecting data by passing all levels index.
df_ind3_region_state = df_ind3.loc[[("Pacific", "Alaska", 1434),
("Pacific", "Hawaii", 4131),
("Mountain", "Arizona", 7259),
("Mountain", "Idaho", 1297)]]
print(df_ind3_region_state)
Output:
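The tuple syntax above can also be tried without the CSV file. The following is a minimal sketch on a made-up three-level table (`toy` and its values are assumptions for the example). It selects rows by full tuples and by a partial tuple that names only the outer levels.

```python
import pandas as pd

# Made-up data for illustration only.
toy = pd.DataFrame({
    "region": ["Pacific", "Pacific", "Mountain", "Mountain"],
    "state": ["Alaska", "Hawaii", "Arizona", "Idaho"],
    "individuals": [10, 20, 30, 40],
    "state_pop": [100, 200, 300, 400],
})
multi3 = toy.set_index(["region", "state", "individuals"]).sort_index()

# Each tuple names one complete key: (region, state, individuals).
picked = multi3.loc[[("Pacific", "Alaska", 10), ("Mountain", "Idaho", 40)]]
print(picked)

# A partial tuple selects every row under that prefix;
# the trailing ":" makes it explicit that all columns are wanted.
pacific_alaska = multi3.loc[("Pacific", "Alaska"), :]
print(pacific_alaska)
```

The partial tuple works because levels are matched from the outside in, which is also why inner-level labels cannot be passed to .loc on their own.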