How to Rename Columns in Polars

Key Insights

Polars offers multiple column renaming strategies: rename() for explicit mappings, alias() for expression-based transformations, and direct assignment for bulk replacements
The alias() method shines when you’re already transforming data, letting you rename columns as part of your select or aggregation pipeline without extra steps
Programmatic renaming with lambda functions handles real-world messiness like inconsistent casing, spaces in column names, or adding standardized prefixes across dozens of columns

Introduction

Column renaming sounds trivial until you’re staring at a dataset with columns named Customer ID, customer_id, CUSTOMER ID, and cust_id that all need to become customer_id. Or you’ve inherited a CSV export where some analyst thought Q1 2024 Revenue ($) was a reasonable column name.

Polars, the Rust-powered DataFrame library that’s rapidly becoming the go-to choice for performance-critical Python data work, handles column renaming with the same philosophy it applies to everything else: give you multiple approaches optimized for different scenarios, and make them all fast.

This guide covers every practical method for renaming columns in Polars, from simple one-off changes to programmatic transformations across hundreds of columns. I’ll show you when to use each approach and how they behave differently in eager versus lazy execution modes.

Using the rename() Method

The rename() method is your workhorse for explicit column renaming. Pass it a dictionary mapping old names to new names, and Polars handles the rest.

import polars as pl

# Create a sample DataFrame
df = pl.DataFrame({
    "firstName": ["Alice", "Bob", "Charlie"],
    "lastName": ["Smith", "Jones", "Brown"],
    "age_years": [30, 25, 35]
})

# Rename a single column
df_renamed = df.rename({"firstName": "first_name"})
print(df_renamed)

Output:

shape: (3, 3)
┌────────────┬──────────┬───────────┐
│ first_name ┆ lastName ┆ age_years │
│ ---        ┆ ---      ┆ ---       │
│ str        ┆ str      ┆ i64       │
╞════════════╪══════════╪═══════════╡
│ Alice      ┆ Smith    ┆ 30        │
│ Bob        ┆ Jones    ┆ 25        │
│ Charlie    ┆ Brown    ┆ 35        │
└────────────┴──────────┴───────────┘

For multiple columns, just add more key-value pairs to the dictionary:

df_cleaned = df.rename({
    "firstName": "first_name",
    "lastName": "last_name",
    "age_years": "age"
})
print(df_cleaned)

Output:

shape: (3, 3)
┌────────────┬───────────┬─────┐
│ first_name ┆ last_name ┆ age │
│ ---        ┆ ---       ┆ --- │
│ str        ┆ str       ┆ i64 │
╞════════════╪═══════════╪═════╡
│ Alice      ┆ Smith     ┆ 30  │
│ Bob        ┆ Jones     ┆ 25  │
│ Charlie    ┆ Brown     ┆ 35  │
└────────────┴───────────┴─────┘

The rename() method returns a new DataFrame and leaves the original untouched, so assign the result to a variable. If you try to rename a column that doesn’t exist, Polars raises a ColumnNotFoundError, which is helpful for catching typos early.

Using alias() in Expressions

When you’re already selecting or transforming columns, alias() lets you rename as part of the operation. This is idiomatic Polars and often cleaner than a separate rename() call.

df = pl.DataFrame({
    "product_name": ["Widget", "Gadget", "Gizmo"],
    "unit_price": [10.0, 25.0, 15.0],
    "quantity_sold": [100, 50, 75]
})

# Rename while selecting
result = df.select(
    pl.col("product_name").alias("product"),
    pl.col("unit_price").alias("price"),
    pl.col("quantity_sold").alias("units")
)
print(result)

The real power of alias() emerges when you’re computing new columns:

result = df.select(
    pl.col("product_name").alias("product"),
    (pl.col("unit_price") * pl.col("quantity_sold")).alias("total_revenue"),
    # method="ordinal" yields integer ranks (ascending: 1 = fewest units sold)
    pl.col("quantity_sold").rank(method="ordinal").alias("sales_rank")
)
print(result)

Output:

shape: (3, 3)
┌─────────┬───────────────┬────────────┐
│ product ┆ total_revenue ┆ sales_rank │
│ ---     ┆ ---           ┆ ---        │
│ str     ┆ f64           ┆ u32        │
╞═════════╪═══════════════╪════════════╡
│ Widget  ┆ 1000.0        ┆ 3          │
│ Gadget  ┆ 1250.0        ┆ 1          │
│ Gizmo   ┆ 1125.0        ┆ 2          │
└─────────┴───────────────┴────────────┘

Use alias() when you’re already in an expression context. Use rename() when you just need to change names without other transformations.

Renaming All Columns at Once

Sometimes you need to replace all column names wholesale—maybe you’re reading a headerless CSV and assigning meaningful names, or conforming to a strict schema.

df = pl.DataFrame({
    "column_1": [1, 2, 3],
    "column_2": ["a", "b", "c"],
    "column_3": [True, False, True]
})

# Replace all column names at once
df.columns = ["id", "category", "is_active"]
print(df)

Output:

shape: (3, 3)
┌─────┬──────────┬───────────┐
│ id  ┆ category ┆ is_active │
│ --- ┆ ---      ┆ ---       │
│ i64 ┆ str      ┆ bool      │
╞═════╪══════════╪═══════════╡
│ 1   ┆ a        ┆ true      │
│ 2   ┆ b        ┆ false     │
│ 3   ┆ c        ┆ true      │
└─────┴──────────┴───────────┘

Important caveat: This is an in-place mutation, which is unusual for Polars. The list length must exactly match the number of columns, or you’ll get an error.

For a more functional approach that returns a new DataFrame, use rename() with a complete mapping:

new_names = ["id", "category", "is_active"]
old_names = df.columns
df_renamed = df.rename(dict(zip(old_names, new_names)))

Programmatic Renaming with Functions

Real datasets have messy column names. You need programmatic solutions that can handle patterns, not just explicit mappings.

In recent Polars releases, rename() also accepts a callable that receives each column name and returns the new name:

df = pl.DataFrame({
    "First Name": ["Alice", "Bob"],
    "Last Name": ["Smith", "Jones"],
    "Email Address": ["alice@example.com", "bob@example.com"]
})

# Convert to snake_case
def to_snake_case(name: str) -> str:
    return name.lower().replace(" ", "_")

df_clean = df.rename(to_snake_case)
print(df_clean)

Output:

shape: (2, 3)
┌────────────┬───────────┬───────────────────┐
│ first_name ┆ last_name ┆ email_address     │
│ ---        ┆ ---       ┆ ---               │
│ str        ┆ str       ┆ str               │
╞════════════╪═══════════╪═══════════════════╡
│ Alice      ┆ Smith     ┆ alice@example.com │
│ Bob        ┆ Jones     ┆ bob@example.com   │
└────────────┴───────────┴───────────────────┘

Lambda functions work great for simpler transformations:

# Add prefix to all columns
df_prefixed = df.rename(lambda col: f"user_{col.lower().replace(' ', '_')}")

# Remove common suffix
df = pl.DataFrame({"name_col": [1], "age_col": [2], "city_col": [3]})
df_clean = df.rename(lambda col: col.removesuffix("_col"))

For more complex patterns, use regex:

import re

df = pl.DataFrame({
    "Q1 2024 Revenue ($)": [1000],
    "Q2 2024 Revenue ($)": [1500],
    "Q3 2024 Revenue ($)": [1200]
})

def clean_column_name(name: str) -> str:
    # Extract quarter and convert to clean format
    match = re.match(r"Q(\d) (\d{4})", name)
    if match:
        return f"revenue_q{match.group(1)}_{match.group(2)}"
    return name.lower().replace(" ", "_")

df_clean = df.rename(clean_column_name)
print(df_clean.columns)
# Output: ['revenue_q1_2024', 'revenue_q2_2024', 'revenue_q3_2024']

Renaming in Lazy vs Eager Mode

Polars’ lazy execution mode builds a query plan that gets optimized before execution. Column renaming works in lazy mode, but there are nuances worth understanding.

# Create a LazyFrame
lf = pl.LazyFrame({
    "old_name": [1, 2, 3],
    "another_old": ["a", "b", "c"]
})

# Chain operations including rename
result = (
    lf
    .rename({"old_name": "id", "another_old": "category"})
    .filter(pl.col("id") > 1)
    .select(pl.col("id"), pl.col("category").str.to_uppercase().alias("category_upper"))
    .collect()  # Execute the query
)
print(result)

Output:

shape: (2, 2)
┌─────┬────────────────┐
│ id  ┆ category_upper │
│ --- ┆ ---            │
│ i64 ┆ str            │
╞═════╪════════════════╡
│ 2   ┆ B              │
│ 3   ┆ C              │
└─────┴────────────────┘

Key considerations for lazy mode:

Rename early in your pipeline if subsequent operations reference the new column names
The query optimizer handles it efficiently—renaming doesn’t create intermediate DataFrames
Use alias() in expressions within lazy pipelines for cleaner code

# This is idiomatic for lazy pipelines
result = (
    pl.scan_csv("data.csv")
    .select(
        pl.col("CustomerID").alias("customer_id"),
        pl.col("OrderTotal").alias("order_total")
    )
    .group_by("customer_id")
    .agg(pl.col("order_total").sum().alias("total_spend"))
    .collect()
)

Direct column assignment (df.columns = [...]) doesn’t work on LazyFrames: a LazyFrame describes a query plan rather than materialized data, and its columns attribute is read-only. Use rename() or alias() in lazy pipelines instead.

Conclusion

Polars gives you the right tool for each column renaming scenario:

Method	Best For	Works with LazyFrame?
`rename({"old": "new"})`	Explicit, targeted renames	Yes
`alias()` in expressions	Renaming during transformations	Yes
`df.columns = [...]`	Bulk replacement of all names	No (eager only)
`rename(function)`	Programmatic pattern-based renaming	Yes

For most production code, I recommend alias() within expression contexts and rename() with a callable for standardization tasks. The explicit dictionary approach works fine for one-off scripts, but programmatic renaming scales better when your column naming conventions evolve.

One final tip: establish column naming conventions early in your data pipeline. Whether you prefer snake_case, camelCase, or something else, apply the transformation immediately after data ingestion. Your future self—and anyone else reading your code—will thank you.
