How to apply if condition in Pandas DataFrame
Last Updated: 21 Nov, 2024
In Pandas DataFrames, applying conditional logic to filter or modify data is a common task. Let’s explore different ways to apply an ‘if condition’ in Pandas DataFrame.
Using apply() with a Lambda Function
We can apply an “if condition” by using apply() with a lambda function. This allows you to apply a custom function row-wise or column-wise to your DataFrame.
Let’s consider a Pandas DataFrame where we have employee data and we want to categorize employees based on their years of experience.
Python
import pandas as pd
# Sample DataFrame
data = {'Name': ['John', 'Sophia', 'Daniel', 'Emma'],
'Experience': [5, 8, 3, 10]}
df = pd.DataFrame(data)
print("Original Dataset")
print(df)
# Apply if condition using lambda function
df['Category'] = df['Experience'].apply(lambda x: 'Senior' if x >= 5 else 'Junior')
print("Dataset with 'Senior' and 'Junior' Category")
print(df)
Output:

Applying ‘if condition’ to classify the ‘Experience’ column into ‘Senior’ and ‘Junior’ categories
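The same method also works row-wise when the condition depends on more than one column. Here is a minimal sketch, assuming a hypothetical 'Certified' column that is not part of the original dataset; passing axis=1 hands each row to the function.

```python
import pandas as pd

# Hypothetical data: the 'Certified' column is an assumption for illustration
df = pd.DataFrame({
    'Name': ['John', 'Sophia', 'Daniel', 'Emma'],
    'Experience': [5, 8, 3, 10],
    'Certified': [False, True, True, True],
})

def categorize(row):
    # 'Senior' requires both enough experience and a certification
    if row['Experience'] >= 5 and row['Certified']:
        return 'Senior'
    return 'Junior'

# axis=1 passes each row (as a Series) to the function
df['Category'] = df.apply(categorize, axis=1)
print(df)
```

A named function is used here instead of a lambda only because the condition has two parts; a lambda would work the same way.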
Using np.where() for Conditional Column Assignment
For conditional logic, np.where() is often faster than apply() because it is vectorized: it evaluates the condition on the whole column at once instead of calling a Python function for each value. It returns one value where the condition is true and another where it is false, which makes it a good fit for large DataFrames.
In this example, we will categorize students based on their exam scores into “Passed” and “Failed.”
Python
import numpy as np
# Sample DataFrame with student scores
data = {'Student': ['Alice', 'Bob', 'Charlie', 'David'],
'Score': [75, 40, 85, 60]}
df = pd.DataFrame(data)
# Apply condition using np.where()
df['Result'] = np.where(df['Score'] >= 50, 'Passed', 'Failed')
print(df)
Output:
Student Score Result
0 Alice 75 Passed
1 Bob 40 Failed
2 Charlie 85 Passed
3 David 60 Passed
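np.where() handles a two-way choice. When there are more than two outcomes (an if/elif/else chain), np.select() is one possible alternative. The sketch below is only an illustration; the 'Distinction' label and the threshold of 80 are assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Student': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Score': [75, 40, 85, 60]})

# Conditions are checked in order; the first one that matches wins
conditions = [df['Score'] >= 80, df['Score'] >= 50]
# Hypothetical labels, one per condition
choices = ['Distinction', 'Passed']

# Rows that match no condition receive the default value
df['Grade'] = np.select(conditions, choices, default='Failed')
print(df)
```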
Using loc[] for Applying ‘If Condition’
The loc[] indexer selects the rows that meet a condition and lets you assign a value to just those rows. In this example, we’ll classify products as “Expensive” or “Affordable” based on their price.
Python
# Sample DataFrame with product prices
data = {'Product': ['Laptop', 'Phone', 'Tablet', 'Smartwatch'],
'Price': [1200, 600, 300, 200]}
df = pd.DataFrame(data)
# Apply condition using loc for price classification
df.loc[df['Price'] > 500, 'Price_Category'] = 'Expensive'
df.loc[df['Price'] <= 500, 'Price_Category'] = 'Affordable'
print(df)
Output:
Product Price Price_Category
0 Laptop 1200 Expensive
1 Phone 600 Expensive
2 Tablet 300 Affordable
3 Smartwatch 200 Affordable
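The two loc[] assignments cover every row only when the conditions are complementary and no price is missing. A row that matches neither condition is left as NaN. One way to guard against this, shown as a sketch with a hypothetical missing price and a hypothetical 'Unknown' label, is to set a default first and then overwrite the matching rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one price is missing to show the edge case
df = pd.DataFrame({'Product': ['Laptop', 'Phone', 'Tablet', 'Smartwatch'],
                   'Price': [1200, np.nan, 300, 200]})

# Start from a default so every row has a value
df['Price_Category'] = 'Affordable'
# Overwrite only the rows that meet the condition
df.loc[df['Price'] > 500, 'Price_Category'] = 'Expensive'
# Comparisons with NaN evaluate to False, so flag missing prices explicitly
df.loc[df['Price'].isna(), 'Price_Category'] = 'Unknown'
print(df)
```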
Using query() for Filtering Data Based on Multiple Conditions
The query() method filters rows using a condition written as a string, which keeps filters with multiple conditions readable. In this example, we filter employees based on both their years of experience and salary.
Python
# Sample DataFrame with employee details
data = {'Employee': ['John', 'Sophia', 'Daniel', 'Emma'],
'Experience': [6, 2, 7, 4],
'Salary': [55000, 40000, 65000, 35000]}
df = pd.DataFrame(data)
# Use query method to filter employees with experience >= 5 and salary > 50000
filtered_df = df.query('Experience >= 5 and Salary > 50000')
print(filtered_df)
Output:
Employee Experience Salary
0 John 6 55000
2 Daniel 7 65000
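The same filter can also be written with boolean indexing, and query() can read Python variables when they are prefixed with '@'. The sketch below compares the two forms; the variable names min_experience and min_salary are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Employee': ['John', 'Sophia', 'Daniel', 'Emma'],
                   'Experience': [6, 2, 7, 4],
                   'Salary': [55000, 40000, 65000, 35000]})

# Hypothetical threshold variables, referenced inside query() with '@'
min_experience = 5
min_salary = 50000
by_query = df.query('Experience >= @min_experience and Salary > @min_salary')

# Same filter with boolean indexing; note '&' and the parentheses around each condition
by_mask = df[(df['Experience'] >= min_experience) & (df['Salary'] > min_salary)]

# Both approaches select the same rows
print(by_query.equals(by_mask))
print(by_query)
```

With boolean indexing, each condition needs its own parentheses because '&' binds more tightly than the comparison operators.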
Using mask() for Conditional Assignment
The mask() method is used to replace values where the condition is True. It is useful when you want to make changes based on a condition while keeping the rest of the data intact. Let’s use it to flag negative values in a financial dataset as “Invalid.”
Python
# Sample DataFrame with financial data
data = {'Account': ['A1', 'A2', 'A3', 'A4'],
'Balance': [1500, -250, 3200, -500]}
df = pd.DataFrame(data)
# Use mask to flag negative balances as "Invalid"
df['Status'] = df['Balance'].mask(df['Balance'] < 0, 'Invalid')
print(df)
Output:
Account Balance Status
0 A1 1500 1500
1 A2 -250 Invalid
2 A3 3200 3200
3 A4 -500 Invalid
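In this case, negative balances are flagged as “Invalid,” while non-negative balances are carried over unchanged, so the Status column holds a mix of numbers and strings.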
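mask() has a counterpart, where(), which keeps values where the condition is True and replaces the rest, so the same result can be written with the condition inverted. If a text-only status column is preferred, np.where() is one option. The 'Valid' label below is a hypothetical choice for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Account': ['A1', 'A2', 'A3', 'A4'],
                   'Balance': [1500, -250, 3200, -500]})

# mask() replaces values where the condition is True
via_mask = df['Balance'].mask(df['Balance'] < 0, 'Invalid')
# where() keeps values where the condition is True and replaces the others
via_where = df['Balance'].where(df['Balance'] >= 0, 'Invalid')
print(via_mask.tolist() == via_where.tolist())

# Text-only status column ('Valid' is a hypothetical label)
df['Status'] = np.where(df['Balance'] < 0, 'Invalid', 'Valid')
print(df)
```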
Applying If Condition on Strings
Conditional logic also works on string columns. In this example, if the name is equal to “Ria,” we assign the value “Found”; otherwise, we assign “Not Found.”
Python
from pandas import DataFrame
# Creating a DataFrame with a list of names
names = {'First_name': ['Hanah', 'Ria', 'Jay', 'Bholu', 'Sachin']}
df = DataFrame(names, columns=['First_name'])
# Applying the condition: If name is 'Ria', assign 'Found'; otherwise, 'Not Found'
df.loc[df['First_name'] == 'Ria', 'Status'] = 'Found'
df.loc[df['First_name'] != 'Ria', 'Status'] = 'Not Found'
# Print the resulting DataFrame
print(df)
Output:
First_name Status
0 Hanah Not Found
1 Ria Found
2 Jay Not Found
3 Bholu Not Found
4 Sachin Not Found
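An exact comparison with == is case-sensitive and does not ignore surrounding spaces. When the text may be inconsistent, it can help to normalize it first with the .str accessor. The sketch below uses a hypothetical variation of the names to show this, along with isin() for matching against several names at once.

```python
import numpy as np
import pandas as pd

# Hypothetical variation of the names with inconsistent casing and spacing
df = pd.DataFrame({'First_name': ['Hanah', ' ria', 'Jay', 'RIA ', 'Sachin']})

# Normalize before comparing so ' ria' and 'RIA ' both match
normalized = df['First_name'].str.strip().str.lower()
df['Status'] = np.where(normalized == 'ria', 'Found', 'Not Found')

# isin() checks each value against a list of names
df['In_list'] = normalized.isin(['ria', 'jay'])
print(df)
```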