
How to apply if condition in Pandas DataFrame

Last Updated : 21 Nov, 2024

In Pandas DataFrames, applying conditional logic to filter or modify data is a common task. Let’s explore different ways to apply an ‘if condition’ in Pandas DataFrame.

Using apply() with a Lambda Function

We can apply an “if condition” by using apply() with a lambda function. Called on a single column, apply() evaluates the lambda once per element; called on a DataFrame, it applies the function row-wise or column-wise, depending on the axis argument.

Let’s consider a Pandas DataFrame where we have employee data and we want to categorize employees based on their years of experience.

Python
import pandas as pd

# Sample DataFrame
data = {'Name': ['John', 'Sophia', 'Daniel', 'Emma'],
        'Experience': [5, 8, 3, 10]}

df = pd.DataFrame(data)
print("Original Dataset")
print(df)

# Apply if condition using lambda function
df['Category'] = df['Experience'].apply(lambda x: 'Senior' if x >= 5 else 'Junior')

print("Dataset with 'Senior' and 'Junior' Category")
print(df)

Output:

Applying ‘if condition’ to classify the ‘Experience’ column into ‘Senior’ and ‘Junior’ categories
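
When the condition depends on more than one column, apply() can be called on the whole DataFrame with axis=1 so that each row is passed to the function. The following is a minimal sketch under stated assumptions: the 'Certified' column and the categorize() helper are hypothetical and are not part of the example above.

```python
import pandas as pd

# Hypothetical data: the 'Certified' column is an assumption added for illustration
df = pd.DataFrame({
    'Name': ['John', 'Sophia', 'Daniel', 'Emma'],
    'Experience': [5, 8, 3, 10],
    'Certified': [True, False, True, True],
})

def categorize(row):
    """Return a label based on two columns of a single row."""
    if row['Experience'] >= 5 and row['Certified']:
        return 'Senior'
    return 'Junior'

# axis=1 passes each row (as a Series) to the function
df['Category'] = df.apply(categorize, axis=1)

print(df)
```

Row-wise apply() calls the Python function once per row, so it tends to be slow on large DataFrames; it is best kept for logic that is hard to express with vectorized operations.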

Using np.where() for Conditional Column Assignment

np.where() returns one value where the condition is true and another where it is false. Because it is vectorized, it is often faster than apply(), which makes it a better fit for large DataFrames.

In this example, we will categorize students based on their exam scores into “Passed” and “Failed.”

Python
import numpy as np
import pandas as pd

# Sample DataFrame with student scores
data = {'Student': ['Alice', 'Bob', 'Charlie', 'David'],
        'Score': [75, 40, 85, 60]}

df = pd.DataFrame(data)

# Apply condition using np.where()
df['Result'] = np.where(df['Score'] >= 50, 'Passed', 'Failed')

print(df)

Output:

   Student  Score  Result
0    Alice     75  Passed
1      Bob     40  Failed
2  Charlie     85  Passed
3    David     60  Passed
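
np.where() calls can also be nested to express an if / elif / else chain. The sketch below reuses the student scores from this example; the 80-point cutoff and the 'Distinction' label are assumptions added purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Student': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Score': [75, 40, 85, 60]})

# The outer condition is checked first; the inner np.where() supplies
# the values for every row where the outer condition is False.
# The 80-point cutoff is a hypothetical threshold.
df['Grade'] = np.where(df['Score'] >= 80, 'Distinction',
                       np.where(df['Score'] >= 50, 'Passed', 'Failed'))

print(df)
```

Nesting stays readable for two or three branches; beyond that, a different approach is usually easier to maintain.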

Using loc[] for Applying ‘If Condition’

The loc[] indexer selects the rows that match a condition and lets you assign a value to a column for just those rows. In this example, we’ll classify products as “Expensive” or “Affordable” based on their price.

Python
import pandas as pd

# Sample DataFrame with product prices
data = {'Product': ['Laptop', 'Phone', 'Tablet', 'Smartwatch'],
        'Price': [1200, 600, 300, 200]}

df = pd.DataFrame(data)

# Apply condition using loc for price classification
df.loc[df['Price'] > 500, 'Price_Category'] = 'Expensive'
df.loc[df['Price'] <= 500, 'Price_Category'] = 'Affordable'

print(df)

Output:

      Product  Price Price_Category
0      Laptop   1200      Expensive
1       Phone    600      Expensive
2      Tablet    300     Affordable
3  Smartwatch    200     Affordable
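
A common variation, sketched here with the same product data, is to fill the new column with a default value first and then overwrite only the rows that match the condition. This avoids writing the complementary condition by hand and guarantees that no row is left unassigned.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Phone', 'Tablet', 'Smartwatch'],
                   'Price': [1200, 600, 300, 200]})

# Start every row with the default ("else") value
df['Price_Category'] = 'Affordable'

# Overwrite only the rows where the condition holds
df.loc[df['Price'] > 500, 'Price_Category'] = 'Expensive'

print(df)
```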

Using query() for Filtering Data Based on Multiple Conditions

The query() method filters rows using a boolean expression written as a string, which keeps multi-condition filters compact. In this example, we filter employees based on both their years of experience and salary.

Python
import pandas as pd

# Sample DataFrame with employee details
data = {'Employee': ['John', 'Sophia', 'Daniel', 'Emma'],
        'Experience': [6, 2, 7, 4],
        'Salary': [55000, 40000, 65000, 35000]}

df = pd.DataFrame(data)

# Use query method to filter employees with experience >= 5 and salary > 50000
filtered_df = df.query('Experience >= 5 and Salary > 50000')

print(filtered_df)

Output:

  Employee  Experience  Salary
0     John           6   55000
2   Daniel           7   65000
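
Thresholds do not have to be hard-coded in the query string: query() can read Python variables when they are prefixed with @. The sketch below uses the same employee data, with min_experience and min_salary as illustrative variable names, and compares the result with the equivalent boolean-mask filter.

```python
import pandas as pd

df = pd.DataFrame({'Employee': ['John', 'Sophia', 'Daniel', 'Emma'],
                   'Experience': [6, 2, 7, 4],
                   'Salary': [55000, 40000, 65000, 35000]})

# Illustrative variable names; @ tells query() to look them up in Python scope
min_experience = 5
min_salary = 50000

filtered_query = df.query('Experience >= @min_experience and Salary > @min_salary')

# Equivalent filter written with a boolean mask
filtered_mask = df[(df['Experience'] >= min_experience) & (df['Salary'] > min_salary)]

print(filtered_query)
print(filtered_query.equals(filtered_mask))
```

Both forms select the same rows; which one to use is largely a matter of readability.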

Using mask() for Conditional Assignment

The mask() method is used to replace values where the condition is True. It is useful when you want to make changes based on a condition while keeping the rest of the data intact. Let’s use it to flag negative values in a financial dataset as “Invalid.”

Python
import pandas as pd

# Sample DataFrame with financial data
data = {'Account': ['A1', 'A2', 'A3', 'A4'],
        'Balance': [1500, -250, 3200, -500]}

df = pd.DataFrame(data)

# Use mask to flag negative balances as "Invalid"
df['Status'] = df['Balance'].mask(df['Balance'] < 0, 'Invalid')

print(df)

Output:

  Account  Balance   Status
0      A1     1500     1500
1      A2     -250  Invalid
2      A3     3200     3200
3      A4     -500  Invalid

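In this case, negative balances are replaced with “Invalid,” while the remaining rows keep their original balance, so the Status column ends up holding a mix of numbers and strings.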
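
If a purely textual status column is preferred, one option is to start from a Series of default labels and mask that instead of the balances. This is a sketch under the assumption that non-negative balances should be labelled 'Valid', a label that does not appear in the example above. It also shows where(), the complement of mask(), which keeps values where the condition is True and replaces the rest.

```python
import pandas as pd

df = pd.DataFrame({'Account': ['A1', 'A2', 'A3', 'A4'],
                   'Balance': [1500, -250, 3200, -500]})

is_negative = df['Balance'] < 0

# Start from a default label ('Valid' is an assumed label) aligned to df's index
default_status = pd.Series('Valid', index=df.index)

# mask(): replace values where the condition is True
df['Status'] = default_status.mask(is_negative, 'Invalid')

# where(): keep values where the condition is True, replace the others
df['Status_alt'] = default_status.where(~is_negative, 'Invalid')

print(df)
```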

Applying an If Condition on Strings

Conditions work on string columns as well. In this example, if the name is equal to “Ria,” we assign the value “Found”; otherwise, we assign “Not Found.”

Python
from pandas import DataFrame

# Creating a DataFrame with a list of names
names = {'First_name': ['Hanah', 'Ria', 'Jay', 'Bholu', 'Sachin']}
df = DataFrame(names, columns=['First_name'])

# Applying the condition: If name is 'Ria', assign 'Found'; otherwise, 'Not Found'
df.loc[df['First_name'] == 'Ria', 'Status'] = 'Found'
df.loc[df['First_name'] != 'Ria', 'Status'] = 'Not Found'

# Print the resulting DataFrame
print(df)

Output:

  First_name     Status
0       Hanah  Not Found
1         Ria      Found
2         Jay  Not Found
3       Bholu  Not Found
4      Sachin  Not Found
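
An exact comparison with == is case-sensitive and does not ignore surrounding whitespace. The following sketch normalizes the names with the .str accessor before comparing, then assigns both labels in one step with np.where(). The messy sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up names with inconsistent case and trailing whitespace
df = pd.DataFrame({'First_name': ['Hanah', 'ria', 'Jay', 'RIA ', 'Sachin']})

# Strip whitespace and lower-case each name before comparing
is_ria = df['First_name'].str.strip().str.lower().eq('ria')

df['Status'] = np.where(is_ria, 'Found', 'Not Found')

print(df)
```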

