How to Filter by Multiple Conditions in Polars

Key Insights

Polars requires explicit parentheses around each condition when combining with &, |, or ~ operators—forgetting them is the most common source of filtering errors
Use pl.all_horizontal() and pl.any_horizontal() for cleaner, more readable code when applying the same logical operation across multiple conditions
Lazy mode with predicate pushdown can dramatically improve performance by filtering data before it’s fully loaded into memory

Introduction

Polars has emerged as the go-to DataFrame library for Python developers who need speed. Built in Rust with a query optimizer, it consistently outperforms pandas by 10-100x on large datasets. But speed means nothing if you can’t express your filtering logic clearly.

Filtering by multiple conditions is something you’ll do constantly—finding users who are active AND subscribed, products that are in stock OR on backorder, transactions that happened this month but NOT from a specific region. Polars gives you several ways to express these conditions, each with different trade-offs for readability and flexibility.

This article covers the practical patterns you’ll actually use, from basic boolean operators to dynamic filter construction for production applications.

Basic Single-Condition Filtering Refresher

Before combining conditions, let’s establish the foundation. Polars uses the .filter() method with column expressions:

import polars as pl

df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Diana", "Eve"],
    "age": [28, 35, 42, 31, 25],
    "department": ["Engineering", "Sales", "Engineering", "Marketing", "Sales"],
    "salary": [75000, 65000, 95000, 70000, 55000],
    "is_active": [True, True, False, True, True]
})

# Single condition filter
senior_employees = df.filter(pl.col("age") > 30)
print(senior_employees)

Output:

shape: (3, 5)
┌─────────┬─────┬─────────────┬────────┬───────────┐
│ name    ┆ age ┆ department  ┆ salary ┆ is_active │
│ ---     ┆ --- ┆ ---         ┆ ---    ┆ ---       │
│ str     ┆ i64 ┆ str         ┆ i64    ┆ bool      │
╞═════════╪═════╪═════════════╪════════╪═══════════╡
│ Bob     ┆ 35  ┆ Sales       ┆ 65000  ┆ true      │
│ Charlie ┆ 42  ┆ Engineering ┆ 95000  ┆ false     │
│ Diana   ┆ 31  ┆ Marketing   ┆ 70000  ┆ true      │
└─────────┴─────┴─────────────┴────────┴───────────┘

The pl.col() function creates an expression that references a column. The comparison operator returns a boolean expression that .filter() uses to select rows.

Combining Conditions with Logical Operators

Here’s where most developers hit their first Polars gotcha: you must wrap each condition in parentheses. This isn’t optional—Python’s operator precedence will break your code without them.

AND Conditions with `&`

# Find active employees over 30
result = df.filter(
    (pl.col("age") > 30) & (pl.col("is_active") == True)
)
print(result)

Output:

shape: (2, 5)
┌───────┬─────┬─────────────┬────────┬───────────┐
│ name  ┆ age ┆ department  ┆ salary ┆ is_active │
│ ---   ┆ --- ┆ ---         ┆ ---    ┆ ---       │
│ str   ┆ i64 ┆ str         ┆ i64    ┆ bool      │
╞═══════╪═════╪═════════════╪════════╪═══════════╡
│ Bob   ┆ 35  ┆ Sales       ┆ 65000  ┆ true      │
│ Diana ┆ 31  ┆ Marketing   ┆ 70000  ┆ true      │
└───────┴─────┴─────────────┴────────┴───────────┘

Without parentheses, pl.col("age") > 30 & pl.col("is_active") would evaluate 30 & pl.col("is_active") first due to operator precedence, causing a cryptic error.

OR Conditions with `|`

# Find employees in Engineering OR with salary above 70000
result = df.filter(
    (pl.col("department") == "Engineering") | (pl.col("salary") > 70000)
)
print(result)

Output:

shape: (3, 5)
┌─────────┬─────┬─────────────┬────────┬───────────┐
│ name    ┆ age ┆ department  ┆ salary ┆ is_active │
│ ---     ┆ --- ┆ ---         ┆ ---    ┆ ---       │
│ str     ┆ i64 ┆ str         ┆ i64    ┆ bool      │
╞═════════╪═════╪═════════════╪════════╪═══════════╡
│ Alice   ┆ 28  ┆ Engineering ┆ 75000  ┆ true      │
│ Charlie ┆ 42  ┆ Engineering ┆ 95000  ┆ false     │
│ Diana   ┆ 31  ┆ Marketing   ┆ 70000  ┆ true      │
└─────────┴─────┴─────────────┴────────┴───────────┘

NOT Conditions with `~`

# Find employees NOT in Sales
result = df.filter(
    ~(pl.col("department") == "Sales")
)
print(result)

You can also use .ne() or != for simple negations, but ~ is essential for negating complex expressions.

Combining All Three

# Active employees over 30 who are NOT in Sales
result = df.filter(
    (pl.col("age") > 30) & 
    (pl.col("is_active") == True) & 
    ~(pl.col("department") == "Sales")
)
print(result)

Output:

shape: (1, 5)
┌───────┬─────┬───────────┬────────┬───────────┐
│ name  ┆ age ┆ department┆ salary ┆ is_active │
│ ---   ┆ --- ┆ ---       ┆ ---    ┆ ---       │
│ str   ┆ i64 ┆ str       ┆ i64    ┆ bool      │
╞═══════╪═════╪═══════════╪════════╪═══════════╡
│ Diana ┆ 31  ┆ Marketing ┆ 70000  ┆ true      │
└───────┴─────┴───────────┴────────┴───────────┘

Using `pl.all_horizontal()` and `pl.any_horizontal()`

When you have many conditions with the same logical operator, chaining & or | becomes unwieldy. Polars provides pl.all_horizontal() (all conditions must be true) and pl.any_horizontal() (at least one condition must be true) for cleaner code.

# Using all_horizontal for multiple AND conditions
result = df.filter(
    pl.all_horizontal(
        pl.col("age") > 25,
        pl.col("salary") > 60000,
        pl.col("is_active") == True
    )
)
print(result)

Output:

shape: (3, 5)
┌───────┬─────┬─────────────┬────────┬───────────┐
│ name  ┆ age ┆ department  ┆ salary ┆ is_active │
│ ---   ┆ --- ┆ ---         ┆ ---    ┆ ---       │
│ str   ┆ i64 ┆ str         ┆ i64    ┆ bool      │
╞═══════╪═════╪═════════════╪════════╪═══════════╡
│ Alice ┆ 28  ┆ Engineering ┆ 75000  ┆ true      │
│ Bob   ┆ 35  ┆ Sales       ┆ 65000  ┆ true      │
│ Diana ┆ 31  ┆ Marketing   ┆ 70000  ┆ true      │
└───────┴─────┴─────────────┴────────┴───────────┘

# Using any_horizontal for multiple OR conditions
result = df.filter(
    pl.any_horizontal(
        pl.col("department") == "Engineering",
        pl.col("salary") > 90000,
        pl.col("age") < 26
    )
)
print(result)

These functions accept any number of boolean expressions and are particularly useful when building conditions programmatically.

Filtering with `.is_in()` for Multiple Values

When checking if a column matches any value from a list, .is_in() is cleaner and faster than chaining OR conditions:

# Instead of this verbose approach
verbose_result = df.filter(
    (pl.col("department") == "Engineering") | 
    (pl.col("department") == "Marketing")
)

# Use is_in()
clean_result = df.filter(
    pl.col("department").is_in(["Engineering", "Marketing"])
)
print(clean_result)

Output:

shape: (3, 5)
┌─────────┬─────┬─────────────┬────────┬───────────┐
│ name    ┆ age ┆ department  ┆ salary ┆ is_active │
│ ---     ┆ --- ┆ ---         ┆ ---    ┆ ---       │
│ str     ┆ i64 ┆ str         ┆ i64    ┆ bool      │
╞═════════╪═════════════════════╪════════╪═══════════╡
│ Alice   ┆ 28  ┆ Engineering ┆ 75000  ┆ true      │
│ Charlie ┆ 42  ┆ Engineering ┆ 95000  ┆ false     │
│ Diana   ┆ 31  ┆ Marketing   ┆ 70000  ┆ true      │
└─────────┴─────┴─────────────┴────────┴───────────┘

Combine .is_in() with other conditions naturally:

# Active employees in specific departments with salary above threshold
result = df.filter(
    (pl.col("department").is_in(["Engineering", "Marketing"])) &
    (pl.col("is_active") == True) &
    (pl.col("salary") >= 70000)
)

Dynamic Filtering with Expression Lists

Production code often requires building filters dynamically—from user input, configuration files, or API parameters. Here’s how to construct filters programmatically:

from functools import reduce
import operator

# Define conditions as a dictionary
filter_config = {
    "age": ("gt", 25),
    "salary": ("gte", 60000),
    "is_active": ("eq", True)
}

def build_condition(column: str, op: str, value) -> pl.Expr:
    """Convert a filter specification to a Polars expression."""
    col = pl.col(column)
    operations = {
        "eq": lambda c, v: c == v,
        "ne": lambda c, v: c != v,
        "gt": lambda c, v: c > v,
        "gte": lambda c, v: c >= v,
        "lt": lambda c, v: c < v,
        "lte": lambda c, v: c <= v,
    }
    return operations[op](col, value)

# Build list of conditions
conditions = [
    build_condition(col, op, val) 
    for col, (op, val) in filter_config.items()
]

# Combine with reduce for AND logic
combined_filter = reduce(operator.and_, conditions)
result = df.filter(combined_filter)
print(result)

For simpler cases, pass a list directly to pl.all_horizontal():

# Dynamic conditions list
conditions = [
    pl.col("age") > 25,
    pl.col("salary") > 60000,
]

# Add conditions based on runtime logic
if include_active_only:
    conditions.append(pl.col("is_active") == True)

result = df.filter(pl.all_horizontal(*conditions))

Performance Tips and Best Practices

Use Lazy Mode for Large Datasets

Lazy evaluation enables predicate pushdown—Polars pushes your filter conditions down to the data source, filtering rows before loading them into memory:

# Eager mode: loads all data, then filters
eager_result = pl.read_csv("large_file.csv").filter(
    pl.col("status") == "active"
)

# Lazy mode: filters during scan
lazy_result = (
    pl.scan_csv("large_file.csv")
    .filter(pl.col("status") == "active")
    .collect()
)

Inspect the query plan with .explain():

query = (
    pl.scan_csv("large_file.csv")
    .filter((pl.col("age") > 30) & (pl.col("status") == "active"))
)
print(query.explain())

You’ll see FILTER operations pushed down to the CSV SCAN step.

Order Conditions by Selectivity

Put the most restrictive conditions first. Polars evaluates conditions left to right, and eliminating rows early reduces work for subsequent conditions:

# Better: rare condition first
df.filter(
    (pl.col("premium_tier") == "platinum") &  # ~1% of rows
    (pl.col("is_active") == True)              # ~80% of rows
)

# Worse: common condition first
df.filter(
    (pl.col("is_active") == True) &            # ~80% of rows
    (pl.col("premium_tier") == "platinum")     # ~1% of rows
)

Avoid Anti-Patterns

Don’t use Python’s and/or keywords—they don’t work with Polars expressions:

# WRONG: Python keywords
df.filter(pl.col("age") > 30 and pl.col("is_active"))  # Raises error

# CORRECT: Bitwise operators with parentheses
df.filter((pl.col("age") > 30) & (pl.col("is_active")))

Don’t filter repeatedly when you can combine conditions:

# Inefficient: multiple filter calls
result = df.filter(pl.col("age") > 30)
result = result.filter(pl.col("is_active") == True)

# Efficient: single filter with combined conditions
result = df.filter(
    (pl.col("age") > 30) & (pl.col("is_active") == True)
)

Polars’ filtering capabilities strike a balance between expressiveness and performance. Master these patterns, and you’ll write cleaner, faster data processing code.

Introduction

Basic Single-Condition Filtering Refresher

Combining Conditions with Logical Operators

AND Conditions with &

OR Conditions with |

NOT Conditions with ~

Combining All Three

Using pl.all_horizontal() and pl.any_horizontal()

Filtering with .is_in() for Multiple Values

Dynamic Filtering with Expression Lists

Performance Tips and Best Practices

Use Lazy Mode for Large Datasets

Order Conditions by Selectivity

Avoid Anti-Patterns

Liked this? There's more.

Similar Articles

AND Conditions with `&`

OR Conditions with `|`

NOT Conditions with `~`

Using `pl.all_horizontal()` and `pl.any_horizontal()`

Filtering with `.is_in()` for Multiple Values