How to Filter by Multiple Conditions in Pandas


Key Insights

  • Always wrap individual conditions in parentheses when combining them with &, |, or ~ operators—Python’s operator precedence will break your code otherwise
  • The .query() method offers cleaner syntax for complex filters and lets you reference external variables with the @ prefix
  • Use .isin() for “one of many” conditions instead of chaining multiple OR statements—it’s both faster and more readable

Filtering DataFrames by multiple conditions is one of the most common operations in data analysis. Whether you’re isolating customers who meet specific criteria, cleaning datasets by removing outliers, or segmenting data for analysis, you’ll combine conditions constantly. Yet I see developers struggle with this regularly, usually because they forget parentheses or use the wrong operators.

This guide covers every practical method for multi-condition filtering in pandas, from basic boolean indexing to advanced .loc[] patterns.

Single Condition Refresher

Before combining conditions, let’s establish the baseline. Boolean indexing with a single condition is straightforward:

import pandas as pd

# Sample data
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'age': [28, 35, 42, 31, 25],
    'department': ['Sales', 'Engineering', 'Sales', 'Marketing', 'Engineering'],
    'salary': [55000, 75000, 62000, 58000, 70000],
    'status': ['active', 'active', 'inactive', 'active', 'pending']
})

# Filter employees over 30
older_employees = df[df['age'] > 30]
print(older_employees)

The expression df['age'] > 30 creates a boolean Series with True for rows meeting the condition. Passing this Series back into the DataFrame’s brackets returns only matching rows.

This works perfectly for simple filters. The complexity emerges when you need multiple conditions.
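To make the mechanics concrete, here's a minimal sketch (using a trimmed-down version of the sample data) that prints the mask itself before applying it:

```python
import pandas as pd

# Trimmed-down sample data for illustration
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [28, 35, 42],
})

# The comparison alone produces a boolean Series, not filtered rows
mask = df['age'] > 30
print(mask.tolist())                # [False, True, True]

# Passing the mask back into the brackets keeps only the True rows
print(df[mask]['name'].tolist())    # ['Bob', 'Charlie']
```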

Combining Conditions with Logical Operators

Pandas uses bitwise operators for combining boolean conditions: & for AND, | for OR, and ~ for NOT. The critical detail that trips up nearly everyone: you must wrap each condition in parentheses.

The AND Operator (&)

Find employees over 30 who earn more than $60,000:

# Correct: parentheses around each condition
high_earners = df[(df['age'] > 30) & (df['salary'] > 60000)]
print(high_earners)

Without parentheses, Python's operator precedence makes & bind tighter than the comparisons, so the expression parses as df['age'] > (30 & df['salary']) > 60000. The resulting chained comparison then tries to reduce a Series to a single boolean and fails with "ValueError: The truth value of a Series is ambiguous":

# Wrong: This raises a ValueError
# df[df['age'] > 30 & df['salary'] > 60000]

The OR Operator (|)

Find employees in Sales or Marketing:

sales_or_marketing = df[(df['department'] == 'Sales') | (df['department'] == 'Marketing')]
print(sales_or_marketing)

You can chain multiple OR conditions:

# Employees under 30 OR over 40 OR in Engineering
subset = df[
    (df['age'] < 30) | 
    (df['age'] > 40) | 
    (df['department'] == 'Engineering')
]

The NOT Operator (~)

Invert any boolean condition with ~:

# Everyone NOT in Sales
not_sales = df[~(df['department'] == 'Sales')]

# Equivalent to:
not_sales_alt = df[df['department'] != 'Sales']

The ~ operator becomes essential when inverting complex conditions:

# Everyone who is NOT (over 30 AND in Sales)
result = df[~((df['age'] > 30) & (df['department'] == 'Sales'))]

Combining AND and OR

Real-world filters often mix operators. Use parentheses liberally to make intent clear:

# Active employees who are either in Sales OR earn over $65,000
complex_filter = df[
    (df['status'] == 'active') & 
    ((df['department'] == 'Sales') | (df['salary'] > 65000))
]
print(complex_filter)

Using the .query() Method

The .query() method provides a string-based syntax that many find more readable, especially for complex filters. It also avoids the parentheses requirement:

# Same filter as above, but cleaner
result = df.query('age > 30 and salary > 60000')
print(result)

# Mixed conditions with natural syntax
result = df.query('age > 30 and department == "Sales"')

Notice that you use and, or, and not instead of &, |, and ~. String values need quotes inside the query string—use different quote types for the outer and inner strings.
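As a quick sketch of those rules (same column names as the sample data): double quotes inside a single-quoted query string work cleanly, and not inverts a clause the way ~ does in boolean indexing:

```python
import pandas as pd

df = pd.DataFrame({
    'department': ['Sales', 'Engineering', 'Sales'],
    'status': ['active', 'active', 'inactive'],
})

# Single quotes outside, double quotes inside; 'not' negates a clause
result = df.query('department == "Sales" and not (status == "inactive")')
print(result['status'].tolist())    # ['active']
```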

Referencing External Variables

Use the @ prefix to reference Python variables within a query:

min_age = 30
target_dept = 'Engineering'
min_salary = 65000

result = df.query('age > @min_age and department == @target_dept and salary >= @min_salary')
print(result)

This is particularly useful when building dynamic filters:

def filter_employees(df, age_threshold, departments):
    # @ resolves any variable in scope, including function parameters
    return df.query('age > @age_threshold and department in @departments')

filtered = filter_employees(df, 25, ['Sales', 'Marketing'])

When to Use .query()

I prefer .query() when:

  • Conditions involve more than two clauses
  • The filter logic is complex enough that parentheses become confusing
  • I’m building filters dynamically with variables

I stick with boolean indexing when:

  • The condition is simple (one or two clauses)
  • I need to use methods or functions on columns within the condition
  • Performance matters on small to medium DataFrames (boolean indexing avoids .query()'s parsing overhead; on very large frames .query() can win via numexpr)
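The second bullet is worth illustrating: boolean indexing lets you call arbitrary methods on columns inline, which is awkward to express inside a query string. A small sketch with made-up values:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'salary': [55000, 75000, 62000],
})

# Arbitrary column methods (.str.len() here) drop straight into the condition
result = df[(df['name'].str.len() > 3) & (df['salary'] > 60000)]
print(result['name'].tolist())      # ['Charlie']
```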

Filtering with .isin() for Multiple Values

When checking if a column value matches any item in a list, .isin() is cleaner and faster than chaining OR conditions:

# Instead of this:
# df[(df['status'] == 'active') | (df['status'] == 'pending')]

# Use this:
valid_statuses = ['active', 'pending']
result = df[df['status'].isin(valid_statuses)]
print(result)

Combining .isin() with Other Conditions

.isin() returns a boolean Series, so it combines naturally with other conditions:

# Active or pending employees in Engineering or Sales, earning over $60k
result = df[
    (df['status'].isin(['active', 'pending'])) &
    (df['department'].isin(['Engineering', 'Sales'])) &
    (df['salary'] > 60000)
]
print(result)

Negating .isin()

Use ~ to exclude values:

# Everyone NOT in inactive status
active_employees = df[~df['status'].isin(['inactive'])]

Advanced: .loc[] with Multiple Conditions

The .loc[] accessor lets you filter rows and select columns simultaneously. This is the most explicit and often most performant approach:

# Filter rows AND select specific columns
result = df.loc[
    (df['age'] > 30) & (df['department'] == 'Sales'),
    ['name', 'salary']
]
print(result)

This returns only the name and salary columns for rows matching both conditions.

Modifying Filtered Data

.loc[] is essential when you need to modify filtered data:

# Give a 10% raise to all active Sales employees over 30
# (cast to float first; multiplying an integer column in place
# triggers dtype warnings in recent pandas versions)
df['salary'] = df['salary'].astype(float)
df.loc[
    (df['age'] > 30) & 
    (df['department'] == 'Sales') & 
    (df['status'] == 'active'),
    'salary'
] *= 1.10

Using .loc[] here avoids the SettingWithCopyWarning that plagues chained indexing.
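To see the difference, compare the chained-indexing pattern with the .loc[] version (a sketch with a two-row DataFrame; the float salaries keep the dtype consistent):

```python
import pandas as pd

df = pd.DataFrame({
    'department': ['Sales', 'Engineering'],
    'salary': [55000.0, 75000.0],
})

# Chained indexing filters first, then assigns -- the assignment may hit
# a temporary copy, so pandas emits SettingWithCopyWarning:
# df[df['department'] == 'Sales']['salary'] *= 1.10

# .loc[] selects and assigns in one step, writing back to the original
df.loc[df['department'] == 'Sales', 'salary'] *= 1.10
print(df['salary'].tolist())    # Sales row raised ~10%, other row untouched
```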

Performance Tips and Common Pitfalls

Chained Comparisons Don’t Work

Python’s elegant chained comparisons don’t work directly in pandas:

# This doesn't work as expected:
# df[20 < df['age'] < 40]

# Use this instead:
df[(df['age'] > 20) & (df['age'] < 40)]

# Or with .query():
df.query('20 < age < 40')  # This actually works in .query()!

The .query() method handles chained comparisons correctly, which is a nice advantage.

Avoid Repeated Filtering

Each filter operation creates a new DataFrame. Chain conditions in a single operation rather than filtering multiple times:

# Inefficient: creates intermediate DataFrames
result = df[df['age'] > 30]
result = result[result['department'] == 'Sales']
result = result[result['salary'] > 50000]

# Efficient: single operation
result = df[
    (df['age'] > 30) & 
    (df['department'] == 'Sales') & 
    (df['salary'] > 50000)
]

Use np.logical_and.reduce() for Many Conditions

When building filters programmatically with many conditions, use NumPy’s reduce:

import numpy as np

conditions = [
    df['age'] > 25,
    df['salary'] > 50000,
    df['status'] == 'active',
    df['department'].isin(['Sales', 'Engineering'])
]

combined = np.logical_and.reduce(conditions)
result = df[combined]

This approach is cleaner than manually chaining & operators and makes it easy to build condition lists dynamically.
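For instance, you could derive the condition list from a thresholds dict (the dict and column choices here are hypothetical, just to show the pattern):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'age': [28, 35, 42],
    'salary': [55000, 75000, 62000],
})

# Hypothetical: build "column > threshold" conditions from a dict
thresholds = {'age': 30, 'salary': 60000}
conditions = [df[col] > value for col, value in thresholds.items()]

combined = np.logical_and.reduce(conditions)
result = df[combined]
print(result['age'].tolist())       # [35, 42]
```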

String Methods in Filters

For string-based conditions, combine with .str accessor methods:

# Names starting with 'A' or 'B', in Sales
result = df[
    (df['name'].str.startswith('A') | df['name'].str.startswith('B')) &
    (df['department'] == 'Sales')
]
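When the OR chain of string checks grows, a single regex can sometimes replace it. A sketch using .str.match, which anchors the pattern at the start of each string:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'department': ['Sales', 'Engineering', 'Sales', 'Marketing'],
})

# One regex replaces the startswith OR chain; .str.match anchors at position 0
result = df[df['name'].str.match('[AB]') & (df['department'] == 'Sales')]
print(result['name'].tolist())      # ['Alice']
```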

Multi-condition filtering in pandas is straightforward once you internalize the parentheses requirement and understand when to use each method. Start with boolean indexing for simple cases, graduate to .query() for complex filters, and always use .loc[] when you need to modify data or select specific columns.
