How to Add a Column to a Polars DataFrame Using .with_columns()
Last Updated: 02 Sep, 2024
The .with_columns() method in Polars allows us to add one or more columns to a DataFrame. Rather than modifying the DataFrame in place, .with_columns() returns a new DataFrame that contains the added columns and leaves the original unchanged. The method is versatile: we can create new columns from existing ones, use constant values, or apply more complex transformations.
In this article, we'll explore different ways to add a new column to a Polars DataFrame using .with_columns().
Install Polars
First, we make sure that Polars is installed on our system.
pip install polars
Basic Usage of .with_columns()
To get started, let's look at the basic syntax of the .with_columns() method.
In the example below, we create a new column "Age_in_5_Years" by adding 5 to the "Age" column. The .with_columns() method is passed a list of expressions, each representing a column to be added.
Python
import polars as pl

# Create a sample DataFrame
df = pl.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
})

# Add a new column
new_df = df.with_columns([
    (pl.col("Age") + 5).alias("Age_in_5_Years")
])

print(new_df)
Output
[Image: Add column to Polars DataFrame using .with_columns()]
Example 1: Adding a Constant Value Column
One of the simplest use cases for .with_columns() is adding a column with a constant value. This is useful when we need to add metadata or a static identifier to each row.
Python
import polars as pl

# Create a sample DataFrame
df = pl.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [100, 150, 200]
})

# Add a constant value column
new_df = df.with_columns([
    pl.lit("₹").alias("Currency")
])

print(new_df)
Output
[Image: Adding a constant value to a Polars DataFrame]
Here, we added a "Currency" column with the constant value "₹".
Example 2: Creating a Column Based on Multiple Existing Columns
We can create a new column by performing operations on multiple existing columns. For instance, let's create a "Total_Cost" column by multiplying the "Price" column by a "Quantity" column.
Python
import polars as pl

# Create a sample DataFrame (the quantities are sample values)
df = pl.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [100, 150, 200],
    "Quantity": [2, 1, 3]
})

# Add a column computed from two existing columns
new_df = df.with_columns([
    (pl.col("Price") * pl.col("Quantity")).alias("Total_Cost")
])

print(new_df)
Output
[Image: Adding a column from existing ones]
This example demonstrates how to create a "Total_Cost" column by multiplying "Price" and "Quantity".
Example 3: Conditional Column Creation
The .with_columns() method allows us to create columns conditionally based on existing data. Let's create a column "Category" that categorizes products as "Expensive" if their price is above 150 and "Affordable" otherwise.
Python
import polars as pl

# Create a sample DataFrame
df = pl.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [100, 150, 200]
})

# Add a conditional column
# String values are wrapped in pl.lit(); bare strings are parsed as column names
new_df = df.with_columns([
    pl.when(pl.col("Price") > 150)
    .then(pl.lit("Expensive"))
    .otherwise(pl.lit("Affordable"))
    .alias("Category")
])

print(new_df)
Output
[Image: Adding a column based on a condition in a Polars DataFrame]
Here, we categorized products based on their price.
Example 4: Adding Multiple Columns Simultaneously
We can add multiple columns in one go using the .with_columns() method by passing a list of expressions.
Python
import polars as pl

# Create a sample DataFrame
df = pl.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [100, 150, 200]
})

# Add multiple columns
new_df = df.with_columns([
    (pl.col("Price") * 1.1).alias("Price_with_Tax"),   # price plus 10% tax
    (pl.col("Price") - 50).alias("Discounted_Price")   # price minus a flat discount of 50
])

print(new_df)
Output
[Image: Adding multiple columns in one go in a Polars DataFrame]
This example shows how to add multiple columns: "Price_with_Tax" and "Discounted_Price".
Example 5: Adding a Column with a Custom Function
Sometimes, we may need to add a column based on a custom function. We can achieve this using the .map_elements() method inside .with_columns().
Python
import polars as pl

# Create a sample DataFrame
df = pl.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
})

# Define a custom function
def age_category(age):
    if age < 30:
        return "Young"
    else:
        return "Mature"

# Add a column using a custom function
# (return_dtype states the output type explicitly)
new_df = df.with_columns([
    pl.col("Age").map_elements(age_category, return_dtype=pl.String).alias("Category")
])

print(new_df)
Output
[Image: Adding a column to a Polars DataFrame using a function]
This example adds a "Category" column based on a custom function that categorizes people by age.
Conclusion
The .with_columns() method in Polars is a powerful and flexible way to add new columns to a DataFrame. Whether we are adding constant values, performing calculations, creating conditional columns, or applying custom functions, .with_columns() provides an intuitive and efficient interface for DataFrame manipulation. With the examples provided, we can now add columns to our Polars DataFrames in a variety of scenarios.