Append or Concatenate Two DataFrames in Python Polars

Last Updated : 21 Aug, 2024

Polars is a fast DataFrame library implemented in Rust that provides efficient ways to work with large datasets. Whether we need to append rows or add columns, Polars offers several methods for the task.

Setting Up Your Environment

Before diving into the examples, ensure you have Polars installed. If not, you can install it using pip:

pip install polars

Basic DataFrame Creation

Let’s create a simple DataFrame to work with:

Python
import polars as pl

# Create a DataFrame
df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 30, 22],
    "gender": ["F", "M", "M"],
    "city": ["New York", "Dallas", "Chicago"]
})

print(df)

Output

shape: (3, 4)
┌─────────┬─────┬────────┬──────────┐
│ name    │ age │ gender │ city     │
│ ---     │ --- │ ---    │ ---      │
│ str     │ i64 │ str    │ str      │
├─────────┼─────┼────────┼──────────┤
│ Alice   │ 24  │ F      │ New York │
│ Bob     │ 30  │ M      │ Dallas   │
│ Charlie │ 22  │ M      │ Chicago  │
└─────────┴─────┴────────┴──────────┘

Loading Data into Polars DataFrame

Assuming we have a data.csv file, we can load its contents into a Polars DataFrame with read_csv.

Python
import polars as pl

# Read the CSV file into a DataFrame; column types are inferred
df = pl.read_csv("data.csv")
print(df)

Output:

shape: (10, 4)
┌─────────┬─────┬────────┬──────────┐
│ name    │ age │ gender │ city     │
│ ---     │ --- │ ---    │ ---      │
│ str     │ i64 │ str    │ str      │
├─────────┼─────┼────────┼──────────┤
│ Alice   │ 24  │ F      │ New York │
│ Bob     │ 30  │ M      │ Dallas   │
│ Charlie │ 22  │ M      │ Chicago  │
│ David   │ 25  │ M      │ Dallas   │
│ Eve     │ 28  │ F      │ Phoenix  │
│ Frank   │ 33  │ M      │ New York │
│ Grace   │ 27  │ F      │ Chicago  │
│ Hank    │ 27  │ M      │ Phoenix  │
│ Ivy     │ 26  │ F      │ Dallas   │
│ Jack    │ 31  │ M      │ Chicago  │
└─────────┴─────┴────────┴──────────┘

Methods to Append or Concatenate DataFrames in Polars

  • Using pl.concat for Concatenation
  • Using vstack for Appending Rows

Using pl.concat for Concatenation

The pl.concat function takes a list of DataFrames and concatenates them either vertically (stacking rows, which is the default) or horizontally (placing columns side by side, with how="horizontal").

Example: Concatenating Two DataFrames Vertically
Python
import polars as pl

# Create two DataFrames
df1 = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 30, 22],
    "gender": ["F", "M", "M"],
    "city": ["New York", "Dallas", "Chicago"]
})

df2 = pl.DataFrame({
    "name": ["David", "Eve", "Frank"],
    "age": [25, 28, 33],
    "gender": ["M", "F", "M"],
    "city": ["Dallas", "Phoenix", "New York"]
})

# Concatenate the DataFrames vertically
df_combined = pl.concat([df1, df2])

print(df_combined)

Output:

[Screenshot: Using pl.concat for Concatenation]

Example: Concatenating Two DataFrames Horizontally

Python
import polars as pl

# Create two DataFrames
df1 = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 30, 22],
})

df2 = pl.DataFrame({
    "gender": ["F", "M", "M"],
    "city": ["New York", "Dallas", "Chicago"]
})

# Concatenate the DataFrames horizontally
df_combined = pl.concat([df1, df2], how="horizontal")

print(df_combined)

Output

[Screenshot: Combine DF in Polars]

Using vstack for Appending Rows

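The vstack method appends the rows of one DataFrame to another (row-wise) and returns the result as a new DataFrame. Both DataFrames must have the same columns with matching data types.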

Example: Appending Two DataFrames in Python Polars

Python
import polars as pl

# Create two DataFrames
df1 = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [24, 30, 22],
    "gender": ["F", "M", "M"],
    "city": ["New York", "Dallas", "Chicago"]
})

df2 = pl.DataFrame({
    "name": ["David", "Eve", "Frank"],
    "age": [25, 28, 33],
    "gender": ["M", "F", "M"],
    "city": ["Dallas", "Phoenix", "New York"]
})

# Append df2 to df1
df_combined = df1.vstack(df2)

print(df_combined)

Output

[Screenshot: Combine DF in Polars using vstack]

Conclusion

Polars provides straightforward functions for appending or concatenating DataFrames. We can use pl.concat for both vertical and horizontal concatenation, or vstack specifically for appending the rows of one DataFrame to another. Polars can also load data from CSV files with read_csv, so data read from disk can be combined in the same way.

