Python tutorial on Polars, a fast DataFrame library for data manipulation and analysis with practical examples.
last modified March 1, 2025
Polars is a fast DataFrame library in Python designed for efficient data manipulation and analysis. It is built for performance, leveraging Rust under the hood. This tutorial introduces Polars with practical examples.
Polars supports lazy and eager execution modes, making it ideal for large datasets. It provides a Pandas-like API with additional optimizations.
This example shows how to create a Polars DataFrame from a dictionary.
create_df.py
import polars as pl
data = { ‘Name’: [‘Alice’, ‘Bob’, ‘Charlie’], ‘Age’: [25, 30, 35], ‘City’: [‘New York’, ‘Los Angeles’, ‘Chicago’] }
df = pl.DataFrame(data) print(df)
The pl.DataFrame function creates a DataFrame from a dictionary. This is the simplest way to initialize a Polars DataFrame.
This example demonstrates reading a CSV file into a Polars DataFrame.
read_csv.py
import polars as pl
df = pl.read_csv(‘data.csv’) print(df)
The pl.read_csv function reads a CSV file into a DataFrame. Polars supports various file formats, including Parquet and JSON.
This example shows how to filter rows based on a condition.
filter_rows.py
import polars as pl
df = pl.DataFrame({ ‘Name’: [‘Alice’, ‘Bob’, ‘Charlie’], ‘Age’: [25, 30, 35] })
filtered_df = df.filter(pl.col(‘Age’) > 30) print(filtered_df)
The filter method filters rows where the ‘Age’ column is greater than 30. Polars uses expressions for efficient filtering.
This example demonstrates selecting specific columns from a DataFrame.
select_columns.py
import polars as pl
df = pl.DataFrame({ ‘Name’: [‘Alice’, ‘Bob’, ‘Charlie’], ‘Age’: [25, 30, 35], ‘City’: [‘New York’, ‘Los Angeles’, ‘Chicago’] })
selected_df = df.select([‘Name’, ‘City’]) print(selected_df)
The select method selects specific columns from the DataFrame. This is useful for focusing on relevant data.
This example shows how to add a new column to a DataFrame.
add_column.py
import polars as pl
df = pl.DataFrame({ ‘Name’: [‘Alice’, ‘Bob’, ‘Charlie’], ‘Age’: [25, 30, 35] })
df = df.with_column((pl.col(‘Age’) * 2).alias(‘DoubleAge’)) print(df)
The with_column method adds a new column ‘DoubleAge’, which is twice the ‘Age’ column. Polars supports column-wise operations.
This example demonstrates grouping data and calculating aggregate statistics.
group_agg.py
import polars as pl
df = pl.DataFrame({ ‘City’: [‘New York’, ‘Los Angeles’, ‘New York’, ‘Chicago’], ‘Sales’: [100, 200, 150, 300] })
grouped_df = df.groupby(‘City’).agg([ pl.col(‘Sales’).sum().alias(‘TotalSales’) ]) print(grouped_df)
The groupby and agg methods group data by ‘City’ and calculate the total sales for each city. Polars supports efficient grouping.
This example shows how to sort a DataFrame by a specific column.
sort_data.py
import polars as pl
df = pl.DataFrame({ ‘Name’: [‘Alice’, ‘Bob’, ‘Charlie’], ‘Age’: [25, 30, 35] })
sorted_df = df.sort(‘Age’, reverse=True) print(sorted_df)
The sort method sorts the DataFrame by the ‘Age’ column in descending order. Polars provides efficient sorting algorithms.
This example demonstrates lazy execution for optimizing performance.
lazy_execution.py
import polars as pl
df = pl.DataFrame({ ‘Name’: [‘Alice’, ‘Bob’, ‘Charlie’], ‘Age’: [25, 30, 35] })
lazy_df = df.lazy().filter(pl.col(‘Age’) > 30).collect() print(lazy_df)
The lazy method enables lazy execution, which optimizes queries before execution. Use collect to trigger computation.
Use Lazy Execution: Optimize queries with lazy execution for large datasets.
Leverage Expressions: Use Polars expressions for efficient data manipulation.
Choose Appropriate Data Types: Use correct data types to improve performance.
Profile Queries: Profile queries to identify bottlenecks.
In this article, we have explored the basics of Polars with practical examples.
My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I possess more than ten years of experience in teaching programming.
List all Polars tutorials.