Reshape Wide DataFrame to Tidy with identifiers using Pandas Melt
Last Updated: 02 Feb, 2021
Sometimes we need to reshape a Pandas DataFrame to make analysis easier, and reshaping plays a crucial role in data analysis. Pandas provides melt() for going from wide to long form; there is no unmelt() function, but a melted frame can be reshaped back ("unmelted") with pivot(). In this article, we will see what Pandas melt() is and how to use it to reshape a wide DataFrame into tidy (long) form with identifiers.
Pandas melt(): pandas.melt() unpivots a DataFrame from wide format to long format. It produces a DataFrame in which one or more columns serve as identifiers. All the remaining columns are treated as values and unpivoted to the row axis, leaving just two non-identifier columns, variable and value.
Syntax: pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
Parameters:
- frame : DataFrame
- id_vars[tuple, list, or ndarray, optional]: Column(s) to use as identifier variables.
- value_vars[tuple, list, or ndarray, optional]: Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.
- var_name[scalar]: Name to use for the ‘variable’ column. If None it uses frame.columns.name or ‘variable’.
- value_name[scalar, default ‘value’]: Name to use for the ‘value’ column.
- col_level[int or string, optional]: If columns are a MultiIndex then use this level to melt.
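As a minimal sketch of how these parameters fit together, the snippet below melts a small made-up frame. The column names city, jan and feb and all the values are illustrative assumptions, not data from this article.

```python
import pandas as pd

# Hypothetical wide frame: one identifier column and two measurement columns.
wide = pd.DataFrame({
    "city": ["Pune", "Delhi"],
    "jan": [21, 14],
    "feb": [24, 17],
})

# id_vars are kept as identifier columns; value_vars are unpivoted into
# one (var_name, value_name) pair of columns.
long = pd.melt(
    wide,
    id_vars=["city"],
    value_vars=["jan", "feb"],
    var_name="month",
    value_name="temp",
)
print(long)
```

Each of the two rows appears once per melted column, so the result has four rows with the columns city, month and temp.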
Example 1:
Python3
# Load the libraries
import numpy as np
import pandas as pd
from scipy.stats import poisson
# We will use scipy.stats to create
# random numbers from Poisson distribution.
np.random.seed(seed = 128)
p1 = poisson.rvs(mu = 10, size = 3)
p2 = poisson.rvs(mu = 15, size = 3)
p3 = poisson.rvs(mu = 20, size = 3)
# Declaring the dataframe
data = pd.DataFrame({"P1":p1,
"P2":p2,
"P3":p3})
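# Wide dataframe
print(" Wide Dataframe")
print(data)
# Melt with the default column names, "variable" and "value"
print(data.melt())
# Change the name of the "variable" column
# (var_name must be a scalar here, not a list)
print(data.melt(var_name = "Sample").head())
# Also specify a name for the "value" column
print("\n Tidy Dataframe")
print(data.melt(var_name = "Sample",
                value_name = "Count").head())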
Output:
Explanation: In this example, we draw three samples from Poisson distributions and build a data frame from them using pandas. Calling melt() then reshapes the data into long form with two columns. By default the first column is called “variable” and contains the original column names, and the second is called “value” and contains the data from the wide-form data frame. Passing var_name and value_name renames these two columns to “Sample” and “Count”.
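To make the change in shape concrete, here is a small sketch that uses fixed, made-up counts in place of the random Poisson samples, so the result does not depend on a seed.

```python
import pandas as pd

# Fixed, made-up counts standing in for the random Poisson samples.
data = pd.DataFrame({
    "P1": [9, 11, 8],
    "P2": [14, 17, 15],
    "P3": [22, 19, 20],
})

tidy = data.melt(var_name="Sample", value_name="Count")

print(data.shape)  # (3, 3): three rows, one column per sample
print(tidy.shape)  # (9, 2): one row per original cell
print(tidy["Sample"].tolist())
```

The three columns are stacked one after another, so the “Sample” column repeats P1 three times, then P2, then P3.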
Example 2:
Python3
import pandas as pd
data = pd.DataFrame({'Name': {0: 'Samrat', 1: 'Tomar', 2: 'Verma'},
'Score': {0: '99', 1: '98', 2: '97'},
'Age': {0: 22, 1: 31, 2: 33}})
# Keep Name as the identifier and unpivot only the Score column
print(pd.melt(data, id_vars=['Name'], value_vars=['Score']))
Output:
Explanation: In this example, we create a data frame using pandas. Then, using the melt() function, we reshape the data into long form with three columns, specifying Name as the identifier and Score as the column to unpivot. Apart from the identifier column, the first column is called “variable” by default and contains the name of the melted column (here, Score), and the second column is called “value” and contains each person's score from the wide-form data frame. The Age column is dropped because it appears in neither id_vars nor value_vars.
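The effect of value_vars can be seen by comparing the call above with one that omits it. This is a sketch on the same data as Example 2, written with plain lists instead of index-keyed dicts; the variable names only_score and all_values are illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    "Name": ["Samrat", "Tomar", "Verma"],
    "Score": ["99", "98", "97"],
    "Age": [22, 31, 33],
})

# With value_vars, only Score is unpivoted and Age is dropped.
only_score = pd.melt(data, id_vars=["Name"], value_vars=["Score"])

# Without value_vars, every non-identifier column (Score and Age) is unpivoted.
all_values = pd.melt(data, id_vars=["Name"])

print(only_score)
print(all_values)
```

The second result has six rows, three for Score followed by three for Age. Because Score holds strings and Age holds integers, the combined “value” column mixes both types, which is worth keeping in mind before doing arithmetic on it.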