Combining columns in a pandas DataFrame is a common data manipulation step that makes data easier to analyze and visualize. For instance, if you have a DataFrame with separate columns for first and last names, you can combine them into a single "Full Name" column. Pandas offers several ways to do this, such as the + operator, str.cat(), and apply().
Method 1: Concatenating Columns (String Columns)
Python
import pandas as pd
df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Smith', 'root']})
# Combine columns into a new column
df['FullName'] = df['FirstName'] + ' ' + df['LastName']
print(df)
Output
  FirstName LastName    FullName
0      John    Smith  John Smith
1      Jane     root   Jane root
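The introduction also mentions str.cat(), which differs from the + operator mainly in how it treats missing values. The following is a minimal sketch, assuming a made-up DataFrame in which one last name is missing; the column names FullNamePlus and FullNameCat are chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the None simulates a missing last name
df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Smith', None]})

# With the + operator, a missing value makes the whole result missing
df['FullNamePlus'] = df['FirstName'] + ' ' + df['LastName']

# str.cat() joins with a separator; na_rep replaces missing values,
# and str.strip() removes the trailing space left by an empty last name
df['FullNameCat'] = (
    df['FirstName'].str.cat(df['LastName'], sep=' ', na_rep='').str.strip()
)

print(df)
```

If missing values cannot occur in your data, the + operator from Method 1 is usually the simpler choice.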
Method 2: Combining Numeric Columns (Mathematical Operations)
You can also combine numeric columns by performing arithmetic operations. For instance, you may want to calculate the total compensation of an employee by adding their Salary and a Bonus column (if present).
- Perform arithmetic operations like addition, subtraction, multiplication, etc.
Python
import pandas as pd
df = pd.DataFrame({
'Salary': [50000, 60000, 70000],
'Bonus': [5000, 6000, 7000]
})
# Combine columns by performing arithmetic operations
df['Total Compensation'] = df['Salary'] + df['Bonus']
df['Salary After Tax'] = df['Salary'] - df['Salary'] * 0.2
df['Salary Times Bonus'] = df['Salary'] * df['Bonus']
print(df)
Output
   Salary  Bonus  Total Compensation  Salary After Tax  Salary Times Bonus
0   50000   5000               55000           40000.0           250000000
1   60000   6000               66000           480...
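When a value such as the bonus may be missing for some rows, plain addition returns NaN for those rows. The sketch below assumes a made-up DataFrame with one missing bonus and shows Series.add() with fill_value=0 as one possible way to treat the gap as zero; the column names TotalPlus and TotalFilled are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second employee has no recorded bonus
df = pd.DataFrame({
    'Salary': [50000, 60000],
    'Bonus': [5000, np.nan]
})

# Plain + propagates the missing value into the result
df['TotalPlus'] = df['Salary'] + df['Bonus']

# Series.add() with fill_value=0 treats the missing bonus as zero
df['TotalFilled'] = df['Salary'].add(df['Bonus'], fill_value=0)

print(df)
```

Whether a missing bonus should count as zero depends on the data; treating it as zero is an assumption made here for the example.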
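Method 3: Using the agg() function
The agg() function can also be used to combine multiple columns into one. It is generally used to apply one or more aggregation functions; called with axis=1, it applies the given function to each row, so ' '.join concatenates the selected string columns of that row.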
Python
import pandas as pd
df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Doe', 'Smith'], 'Age': [28, 34]})
# Join the selected columns row by row using agg() with str.join
df['FullName'] = df[['FirstName', 'LastName']].agg(' '.join, axis=1)
print(df)
Output
  FirstName LastName  Age    FullName
0      John      Doe   28    John Doe
1      Jane    Smith   34  Jane Smith
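Note that str.join only accepts strings, so including a numeric column such as Age in the selection would raise a TypeError. A minimal sketch of one workaround, assuming the same sample data, casts the selected columns with astype(str) first; the Label column name is chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Doe', 'Smith'], 'Age': [28, 34]})

# Cast every selected column to str so that ' '.join can handle the numeric Age
df['Label'] = df[['FirstName', 'LastName', 'Age']].astype(str).agg(' '.join, axis=1)

print(df)
```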
Method 4: Using apply() with Lambda Functions
The apply() function can be used with a lambda function to combine columns row by row (axis=1). This method is particularly useful for more complex combinations or when several columns are involved:
- Use apply() with a lambda function to combine columns.
Python
import pandas as pd
df = pd.DataFrame({
'First Name': ['John', 'Jane'],
'Last Name': ['Doe', 'Smith'],
'Age': [28, 34]
})
# First part: Combine 'First Name' and 'Last Name'
df['Message'] = df.apply(lambda row: f"{row['First Name']} {row['Last Name']}", axis=1)
# Second part: Add age-related information
df['Message'] = df['Message'] + df.apply(lambda row: f" is {row['Age']} years old.", axis=1)
print(df)
Output
  First Name Last Name  Age                      Message
0       John       Doe   28    John Doe is 28 years old.
1       Jane     Smith   34  Jane Smith is 34 years old.
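For combinations that involve conditional logic, a named function passed to apply() can be easier to read than a long lambda. The following is a sketch under stated assumptions: the helper describe(), the age cut-off of 65, and the sample ages are all hypothetical and only serve to illustrate the pattern.

```python
import pandas as pd

df = pd.DataFrame({
    'First Name': ['John', 'Jane'],
    'Last Name': ['Doe', 'Smith'],
    'Age': [28, 70]
})

def describe(row):
    """Hypothetical helper: build a message that depends on the row's age."""
    name = f"{row['First Name']} {row['Last Name']}"
    # The cut-off of 65 is an arbitrary value chosen for illustration
    status = 'a senior' if row['Age'] >= 65 else 'an adult'
    return f"{name} is {status} ({row['Age']})."

# axis=1 passes each row to describe() as a Series
df['Message'] = df.apply(describe, axis=1)

print(df)
```

Because apply() with axis=1 calls the function once per row, it tends to be slower than vectorized operations such as the + operator on large DataFrames, so it is best kept for logic that cannot easily be expressed column-wise.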
Method 5: Using map()
The map() method of a Series applies a function to each element of a column, and the transformed column can then be combined with another one. For instance:
Python
import pandas as pd
df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Doe', 'Smith']})
# Use Series.map() to append a space to each first name, then add the last name
df['FullName'] = df['FirstName'].map(lambda first: first + ' ') + df['LastName']
print(df)
Output
  FirstName LastName    FullName
0      John      Doe    John Doe
1      Jane    Smith  Jane Smith
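Python's built-in map() function (distinct from the pandas Series.map() method) can also combine two columns when they are paired up with zip(). This is a minimal sketch on the same sample data, offered as one possible variant rather than a recommended approach.

```python
import pandas as pd

df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Doe', 'Smith']})

# zip() pairs the two columns row by row; map() joins each pair with a space
df['FullName'] = list(map(' '.join, zip(df['FirstName'], df['LastName'])))

print(df)
```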