
Pandas Combine Columns

Last Updated : 13 Jan, 2025

Combining columns in a pandas DataFrame is a common data-manipulation step that makes data easier to analyze and visualize. For instance, if you have a DataFrame with separate columns for first and last names, you can combine them into a single "Full Name" column. This can be done in several ways in pandas, such as the + operator, str.cat(), agg(), and apply().

Method 1: Concatenating Columns (String Columns)

Python
import pandas as pd

df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Smith', 'root']})

# Combine columns into a new column
df['FullName'] = df['FirstName'] + ' ' + df['LastName']
print(df)

Output
  FirstName LastName    FullName
0      John    Smith  John Smith
1      Jane     root   Jane root
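The introduction also mentions str.cat(), which does the same job as the + operator but lets you state the separator and the handling of missing values explicitly. The following is a minimal sketch, not part of the original example: the missing last name and the '(unknown)' placeholder are assumptions made for illustration.

```python
import pandas as pd

# Illustrative data: the second last name is deliberately missing
df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Smith', None]})

# Without na_rep, a missing value on either side yields a missing result
plain = df['FirstName'].str.cat(df['LastName'], sep=' ')

# With na_rep, missing values are replaced by the placeholder before joining
filled = df['FirstName'].str.cat(df['LastName'], sep=' ', na_rep='(unknown)')

print(plain)
print(filled)
```

The + operator behaves like the first call: any row with a missing value produces a missing result.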

Method 2: Combining Numeric Columns (Mathematical Operations)

You can also combine numeric columns with arithmetic operations. For instance, you can calculate an employee's total compensation by adding the Salary and Bonus columns.

  • Perform arithmetic operations like addition, subtraction, multiplication, etc.
Python
import pandas as pd

df = pd.DataFrame({
    'Salary': [50000, 60000, 70000],
    'Bonus': [5000, 6000, 7000]
})

# Combine columns by performing arithmetic operations
df['Total Compensation'] = df['Salary'] + df['Bonus']  
df['Salary After Tax'] = df['Salary'] - df['Salary'] * 0.2  
df['Salary Times Bonus'] = df['Salary'] * df['Bonus'] 

print(df)

Output
   Salary  Bonus  Total Compensation  Salary After Tax  Salary Times Bonus
0   50000   5000               55000           40000.0           250000000
1   60000   6000               66000           480...
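If a numeric column contains missing values, plain arithmetic propagates them into the result. As a hedged sketch (the missing bonus below is an assumption added for illustration), Series.add() with fill_value=0 treats a missing value on one side as zero:

```python
import numpy as np
import pandas as pd

# Illustrative data: the second employee has no recorded bonus
df = pd.DataFrame({
    'Salary': [50000, 60000, 70000],
    'Bonus': [5000, np.nan, 7000]
})

# The + operator gives NaN wherever either operand is missing
naive_total = df['Salary'] + df['Bonus']

# add() with fill_value=0 substitutes 0 for a value missing on one side only
safe_total = df['Salary'].add(df['Bonus'], fill_value=0)

print(naive_total)
print(safe_total)
```

Whether a missing bonus should really count as zero depends on the data, so treat fill_value as a deliberate choice rather than a default.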

Method 3: Using agg() function

The agg() function can also be used to combine multiple columns into one. With axis=1, it applies the given function to each row, so ' '.join concatenates the selected string columns row by row.

Python
import pandas as pd

df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Doe', 'Smith'], 'Age': [28, 34]})

# Combine columns by applying ' '.join to each row with agg()
df['FullName'] = df[['FirstName', 'LastName']].agg(' '.join, axis=1)
print(df)

Output
  FirstName LastName  Age    FullName
0      John      Doe   28    John Doe
1      Jane    Smith   34  Jane Smith
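' '.join only accepts strings, so including a numeric column such as Age in the selection raises a TypeError. One possible workaround, sketched here under the assumption that a plain string rendering of the numbers is acceptable, is to convert the selection with astype(str) first. The 'Label' column name is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Doe', 'Smith'], 'Age': [28, 34]})

# Convert every selected column to strings, then join each row with a space
# ('Label' is a hypothetical column name used only for this sketch)
df['Label'] = df[['FirstName', 'LastName', 'Age']].astype(str).agg(' '.join, axis=1)

print(df)
```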

Method 4: Using apply() with Lambda Functions

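The apply() function can be used with a lambda function to combine columns. With axis=1, apply() passes each row to the lambda, which makes it well suited to more complex combinations or to combining many columns, although it is usually slower than vectorized operations on large DataFrames: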

  • Use apply() with a lambda function to combine columns.
Python
import pandas as pd

df = pd.DataFrame({
    'First Name': ['John', 'Jane'],
    'Last Name': ['Doe', 'Smith'],
    'Age': [28, 34]
})
# First part: Combine 'First Name' and 'Last Name'
df['Message'] = df.apply(lambda row: f"{row['First Name']} {row['Last Name']}", axis=1)

# Second part: Add age-related information
df['Message'] = df['Message'] + df.apply(lambda row: f" is {row['Age']} years old.", axis=1)

print(df)

Output
  First Name Last Name  Age                      Message
0       John       Doe   28    John Doe is 28 years old.
1       Jane     Smith   34  Jane Smith is 34 years old.
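For a simple message like this one, the same column can be built without apply(). The sketch below is an alternative to the example above, not a replacement for it; it uses vectorized string concatenation and converts Age with astype(str) so it can be joined to text.

```python
import pandas as pd

df = pd.DataFrame({
    'First Name': ['John', 'Jane'],
    'Last Name': ['Doe', 'Smith'],
    'Age': [28, 34]
})

# Build the whole message column at once; Age must be cast to str first
df['Message'] = (
    df['First Name'] + ' ' + df['Last Name']
    + ' is ' + df['Age'].astype(str) + ' years old.'
)

print(df)
```

The row-wise apply() version remains the more flexible option when the logic involves conditions or calls that have no vectorized equivalent.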

Method 5: Using map()

Series.map() applies a function to each element of a single column. To combine two columns with it, transform one column element by element and then add the other. For instance:

Python
import pandas as pd

df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Doe', 'Smith']})

# Use map() to append a space to each first name, then add the last name
df['FullName'] = df['FirstName'].map(lambda first: first + ' ') + df['LastName']

print(df)

Output
  FirstName LastName    FullName
0      John      Doe    John Doe
1      Jane    Smith  Jane Smith
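Python's built-in map() can also walk two columns in parallel. This is a minimal sketch, assuming both columns contain only strings: zip() pairs up the values and ' '.join is applied to each pair.

```python
import pandas as pd

df = pd.DataFrame({'FirstName': ['John', 'Jane'], 'LastName': ['Doe', 'Smith']})

# zip() yields (first, last) tuples; map() joins each tuple with a space
df['FullName'] = list(map(' '.join, zip(df['FirstName'], df['LastName'])))

print(df)
```

Because the result is assigned as a plain list, it is matched to rows by position rather than by index label.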
