Pandas - Select Rows with Multiple Conditions (AND/OR)

Key Insights

Pandas provides multiple syntaxes for filtering rows with compound conditions: bitwise operators (&, |, ~), query() method, and loc[] with callable functions—each with distinct performance characteristics and readability trade-offs
Bitwise operators require parentheses around each condition due to operator precedence, while query() allows natural Python-like expressions without parentheses but with potential variable scope complications
For complex multi-condition filtering, combining isin(), between(), and boolean indexing often outperforms chained comparisons, especially on large datasets

Bitwise Operators for AND/OR Conditions

The most common approach uses bitwise operators: & (AND), | (OR), and ~ (NOT). Each condition must be wrapped in parentheses due to Python’s operator precedence.

import pandas as pd
import numpy as np

# Sample dataset
df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
    'price': [1200, 25, 75, 350, 80],
    'stock': [5, 150, 45, 12, 8],
    'category': ['Electronics', 'Accessories', 'Accessories', 'Electronics', 'Accessories']
})

# AND condition: Electronics AND price > 500
electronics_expensive = df[(df['category'] == 'Electronics') & (df['price'] > 500)]
print(electronics_expensive)

Output:

   product  price  stock     category
0   Laptop   1200      5  Electronics

For OR conditions:

# OR condition: price < 50 OR stock > 100
cheap_or_abundant = df[(df['price'] < 50) | (df['stock'] > 100)]
print(cheap_or_abundant)

Output:

  product  price  stock     category
1   Mouse     25    150  Accessories

Combine multiple conditions with grouping:

# Complex: (Electronics AND price > 300) OR (Accessories AND stock < 10)
complex_filter = df[
    ((df['category'] == 'Electronics') & (df['price'] > 300)) |
    ((df['category'] == 'Accessories') & (df['stock'] < 10))
]
print(complex_filter)

Output:

   product  price  stock     category
0   Laptop   1200      5  Electronics
3  Monitor    350     12  Electronics
4   Webcam     80      8  Accessories

Using query() Method

The query() method provides cleaner syntax for complex conditions, accepting string expressions similar to SQL WHERE clauses.

# AND condition with query()
electronics_expensive = df.query('category == "Electronics" and price > 500')
print(electronics_expensive)

# OR condition
cheap_or_abundant = df.query('price < 50 or stock > 100')
print(cheap_or_abundant)

# Complex nested conditions
complex_filter = df.query(
    '(category == "Electronics" and price > 300) or '
    '(category == "Accessories" and stock < 10)'
)
print(complex_filter)

Using variables with query():

min_price = 300
target_category = 'Electronics'

# Use @ to reference variables
result = df.query('category == @target_category and price > @min_price')
print(result)

The query() method is particularly readable for complex conditions but has limitations with column names containing spaces or special characters.

loc[] with Boolean Indexing

The loc[] accessor combined with boolean indexing provides explicit control over row selection.

# Using loc with multiple conditions
result = df.loc[
    (df['category'] == 'Electronics') & (df['price'] > 500),
    ['product', 'price']  # Select specific columns
]
print(result)

Output:

  product  price
0  Laptop   1200

Using callable functions with loc[]:

# Callable approach for reusable filters
def expensive_electronics(df):
    return (df['category'] == 'Electronics') & (df['price'] > 500)

result = df.loc[expensive_electronics]
print(result)

isin() for Multiple Values

When checking if a column value matches any value in a list, isin() is more efficient than chaining OR conditions.

# Instead of: (df['product'] == 'Laptop') | (df['product'] == 'Monitor') | ...
target_products = ['Laptop', 'Monitor', 'Webcam']
result = df[df['product'].isin(target_products)]
print(result)

Output:

   product  price  stock     category
0   Laptop   1200      5  Electronics
3  Monitor    350     12  Electronics
4   Webcam     80      8  Accessories

Combining isin() with other conditions:

# Products in list AND price > 100
target_products = ['Laptop', 'Monitor', 'Webcam']
result = df[df['product'].isin(target_products) & (df['price'] > 100)]
print(result)

Use ~ for negation (NOT IN):

# Products NOT in the list
excluded_products = ['Mouse', 'Keyboard']
result = df[~df['product'].isin(excluded_products)]
print(result)

between() for Range Conditions

The between() method simplifies range checks, replacing two comparison operations.

# Price between 50 and 500 (inclusive by default)
mid_range = df[df['price'].between(50, 500)]
print(mid_range)

Output:

    product  price  stock     category
2  Keyboard     75     45  Accessories
3   Monitor    350     12  Electronics
4    Webcam     80      8  Accessories

Exclusive boundaries:

# Exclusive boundaries
result = df[df['price'].between(50, 500, inclusive='neither')]
print(result)

Combining between() with other conditions:

# Price between 50-500 AND Accessories category
result = df[
    df['price'].between(50, 500) & 
    (df['category'] == 'Accessories')
]
print(result)

String Matching with Multiple Conditions

For string columns, combine pattern matching with other conditions.

# Create sample data with text
df_text = pd.DataFrame({
    'product': ['Laptop Pro', 'Gaming Mouse', 'Mechanical Keyboard', 'LED Monitor', 'HD Webcam'],
    'price': [1200, 45, 120, 350, 80],
    'brand': ['Apple', 'Logitech', 'Corsair', 'Dell', 'Logitech']
})

# Products containing 'o' AND price > 100
result = df_text[
    df_text['product'].str.contains('o', case=False) & 
    (df_text['price'] > 100)
]
print(result)

Output:

            product  price    brand
0       Laptop Pro   1200    Apple
2  Mechanical Keyboard    120  Corsair
3       LED Monitor    350     Dell

Multiple string conditions:

# Brand is Logitech OR product starts with 'LED'
result = df_text[
    (df_text['brand'] == 'Logitech') | 
    df_text['product'].str.startswith('LED')
]
print(result)

Performance Considerations

For large datasets, method choice impacts performance significantly.

import time

# Create large dataset
large_df = pd.DataFrame({
    'A': np.random.randint(0, 100, 1000000),
    'B': np.random.randint(0, 100, 1000000),
    'C': np.random.choice(['X', 'Y', 'Z'], 1000000)
})

# Benchmark bitwise operators
start = time.time()
result1 = large_df[(large_df['A'] > 50) & (large_df['B'] < 30)]
time1 = time.time() - start

# Benchmark query()
start = time.time()
result2 = large_df.query('A > 50 and B < 30')
time2 = time.time() - start

print(f"Bitwise: {time1:.4f}s")
print(f"Query: {time2:.4f}s")

Generally, bitwise operators perform slightly faster for simple conditions, while query() can optimize complex expressions. For repeated filtering operations, pre-compute boolean masks:

# Pre-compute masks for reuse
mask_a = large_df['A'] > 50
mask_b = large_df['B'] < 30
mask_c = large_df['C'] == 'X'

# Combine masks efficiently
result = large_df[mask_a & mask_b & mask_c]

Handling Missing Values in Conditions

Missing values require explicit handling in multi-condition filters.

df_missing = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor'],
    'price': [1200, np.nan, 75, 350],
    'stock': [5, 150, np.nan, 12]
})

# Exclude NaN values explicitly
result = df_missing[
    (df_missing['price'] > 100) & 
    (df_missing['price'].notna())
]
print(result)

For inclusive NaN handling:

# Include rows where price > 100 OR price is NaN
result = df_missing[
    (df_missing['price'] > 100) | 
    (df_missing['price'].isna())
]
print(result)

Choose your filtering approach based on expression complexity, performance requirements, and team coding standards. Bitwise operators offer the best performance for simple conditions, query() excels at readability for complex logic, and specialized methods like isin() and between() optimize specific use cases.

Bitwise Operators for AND/OR Conditions

Using query() Method

loc[] with Boolean Indexing

isin() for Multiple Values

between() for Range Conditions

String Matching with Multiple Conditions

Performance Considerations

Handling Missing Values in Conditions

Liked this? There's more.