NumPy - Comparison Operators (==, !=, <, >, <=, >=)

Key Insights

  • NumPy comparison operators perform element-wise comparisons and return boolean arrays, enabling vectorized conditional logic that is often 10-100x faster than Python loops on large arrays
  • Broadcasting rules allow comparing arrays of different shapes, automatically expanding dimensions to match compatible arrays without copying data
  • Boolean arrays from comparisons integrate directly with indexing, np.where(), np.any(), and np.all() for powerful data filtering and conditional operations

Element-Wise Comparison Basics

NumPy’s comparison operators (==, !=, <, >, <=, >=) work element-by-element on arrays, returning boolean arrays of the same shape. Python lists behave differently: comparing two lists with == or < treats each list as a whole sequence and returns a single boolean, and comparing a list with a number using < raises TypeError. NumPy instead vectorizes the comparison across all elements.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Element-wise comparisons
print(arr > 3)        # [False False False  True  True]
print(arr == 3)       # [False False  True False False]
print(arr <= 2)       # [ True  True False False False]
print(arr != 4)       # [ True  True  True False  True]

This works with multi-dimensional arrays identically:

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

result = matrix >= 5
print(result)
# [[False False False]
#  [False  True  True]
#  [ True  True  True]]

print(result.dtype)  # bool
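To make the contrast with plain Python lists concrete, here is a minimal sketch. The variable names and sample values are illustrative assumptions, not part of the examples above.

```python
import numpy as np

values_list = [1, 2, 3, 4, 5]      # hypothetical sample data
values_arr = np.array(values_list)

# A list compares as a whole object and yields one boolean
list_equal = values_list == [1, 2, 3, 4, 5]
print(list_equal)                  # True

# Ordering a list against an int is not defined in Python 3
try:
    values_list > 3
    list_error = None
except TypeError as exc:
    list_error = type(exc).__name__
print(list_error)                  # TypeError

# The array compares each element and keeps the input shape
arr_mask = values_arr > 3
print(arr_mask.shape, arr_mask.dtype)   # (5,) bool
```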

Comparing Arrays with Arrays

When comparing two arrays, NumPy performs element-wise comparison at matching positions. Arrays must have compatible shapes (same shape or broadcastable).

arr1 = np.array([10, 20, 30, 40])
arr2 = np.array([15, 20, 25, 50])

print(arr1 > arr2)   # [False False  True False]
print(arr1 == arr2)  # [False  True False False]

# Multi-dimensional comparison
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[1, 3], [2, 4]])

print(mat1 < mat2)
# [[False  True]
#  [False False]]
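The shape requirement can be seen by comparing arrays that cannot be broadcast. The sketch below assumes two small hypothetical arrays, a and b, and shows that an ordering comparison raises ValueError, while np.array_equal returns False for mismatched shapes instead of raising.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2])          # length 2 cannot broadcast against length 3

# Element-wise comparison needs compatible shapes
try:
    a > b
    shape_error = None
except ValueError as exc:
    shape_error = str(exc)
print(shape_error)

# np.array_equal answers "same shape and same values?" with one boolean
whole_equal = np.array_equal(a, b)
print(whole_equal)            # False
```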

For floating-point comparisons with tolerance, use np.isclose() or np.allclose():

arr1 = np.array([1.0, 2.0, 3.0])
arr2 = np.array([1.0000001, 2.0, 3.0000001])

# Exact equality is False wherever the values differ, even by 1e-7
print(arr1 == arr2)  # [False  True False]

# Use isclose for tolerance-based comparison
print(np.isclose(arr1, arr2))  # [ True  True  True]
print(np.allclose(arr1, arr2))  # True (single boolean)
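The tolerance used by np.isclose() comes from its rtol and atol parameters (defaults 1e-05 and 1e-08). The sketch below reproduces the documented test, |a - b| <= atol + rtol * |b|, by hand and then tightens the tolerance; the stricter values are arbitrary choices for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0000001, 2.0, 3.0000001])

# Defaults are rtol=1e-05 and atol=1e-08
default_close = np.isclose(a, b)

# Hand-written version of the same test
manual_close = np.abs(a - b) <= (1e-08 + 1e-05 * np.abs(b))

# A stricter, purely absolute tolerance flags the 1e-7 differences
strict_close = np.isclose(a, b, rtol=0.0, atol=1e-9)

print(default_close)   # [ True  True  True]
print(manual_close)    # [ True  True  True]
print(strict_close)    # [False  True False]
```

Because the relative term scales with the magnitude of b, the default tolerance may be too loose for values near zero and too tight for very large values, so it is worth choosing rtol and atol for the data at hand.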

Broadcasting in Comparisons

Broadcasting allows comparing arrays of different shapes by automatically expanding dimensions. This eliminates explicit loops and temporary array creation.

# Compare 2D array with 1D array
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

row = np.array([2, 5, 8])

# Broadcasting: row is compared against each row of matrix
result = matrix > row
print(result)
# [[False False False]
#  [ True False False]
#  [ True  True  True]]

# Compare with column vector
col = np.array([[3], [6], [9]])
result = matrix < col
print(result)
# [[ True  True False]
#  [ True  True False]
#  [ True  True False]]

Broadcasting with scalars is the most common pattern:

data = np.array([[10, 20, 30],
                 [40, 50, 60]])

# Scalar broadcasts to all elements
threshold = 35
mask = data > threshold
print(mask)
# [[False False False]
#  [ True  True  True]]
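One way to reason about broadcasting is to look only at shapes. The following sketch reuses the matrix, row, and column values from this section and adds a row-versus-column comparison as an extra illustration; np.broadcast_shapes requires NumPy 1.20 or later.

```python
import numpy as np

matrix = np.arange(1, 10).reshape(3, 3)   # shape (3, 3)
row = np.array([2, 5, 8])                 # shape (3,)
col = np.array([[3], [6], [9]])           # shape (3, 1)

print((matrix > row).shape)   # (3, 3)
print((matrix < col).shape)   # (3, 3)

# (3,) against (3, 1): both operands are stretched to (3, 3)
row_vs_col = row < col
print(row_vs_col.shape)       # (3, 3)

# Report the result shape without performing a comparison
result_shape = np.broadcast_shapes((3,), (3, 1))
print(result_shape)           # (3, 3)
```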

Boolean Indexing with Comparison Results

Boolean arrays from comparisons serve as masks for filtering data, enabling concise conditional selection without explicit loops.

temperatures = np.array([72, 85, 91, 68, 77, 95, 88])

# Select temperatures above 80
hot_days = temperatures[temperatures > 80]
print(hot_days)  # [85 91 95 88]

# Multiple conditions: combine with & and wrap each comparison in parentheses
comfortable = temperatures[(temperatures >= 70) & (temperatures <= 85)]
print(comfortable)  # [72 85 77]

# 2D boolean indexing
data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Get all elements greater than 5
print(data[data > 5])  # [6 7 8 9]

# Replace values conditionally
data_copy = data.copy()
data_copy[data_copy < 5] = 0
print(data_copy)
# [[0 0 0]
#  [0 5 6]
#  [7 8 9]]
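Masks can also be inverted with ~ or combined with |. The sketch below reuses the temperatures array from this section; the names uncomfortable and extremes are illustrative.

```python
import numpy as np

temperatures = np.array([72, 85, 91, 68, 77, 95, 88])

comfortable_mask = (temperatures >= 70) & (temperatures <= 85)

# ~ inverts a mask: everything outside the comfortable range
uncomfortable = temperatures[~comfortable_mask]
print(uncomfortable)   # [91 68 95 88]

# | selects elements matching either condition
extremes = temperatures[(temperatures < 70) | (temperatures > 90)]
print(extremes)        # [91 68 95]
```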

Conditional Operations with np.where()

np.where() provides vectorized if-else logic, selecting values from two arrays based on a condition.

scores = np.array([45, 78, 92, 65, 88, 54])

# Ternary operation: pass if >= 60, fail otherwise
results = np.where(scores >= 60, 'Pass', 'Fail')
print(results)
# ['Fail' 'Pass' 'Pass' 'Pass' 'Pass' 'Fail']

# Numerical transformation
adjusted = np.where(scores < 60, scores + 10, scores)
print(adjusted)  # [55 78 92 65 88 64]

# Multiple conditions using nested np.where()
grades = np.where(scores >= 90, 'A',
                  np.where(scores >= 80, 'B',
                          np.where(scores >= 70, 'C', 'D')))
print(grades)  # ['D' 'C' 'A' 'D' 'B' 'D']

For complex multi-condition scenarios, use np.select():

conditions = [
    scores >= 90,
    (scores >= 80) & (scores < 90),
    (scores >= 70) & (scores < 80),
    scores < 70
]
choices = ['A', 'B', 'C', 'D']

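# Pass a string default: the implicit default of 0 cannot be combined
# with string choices on recent NumPy releases
grades = np.select(conditions, choices, default='')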
print(grades)  # ['D' 'C' 'A' 'D' 'B' 'D']
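np.select() uses the first condition that matches each element, so ordered conditions do not need explicit upper bounds. A minimal sketch, assuming a hypothetical mapping from scores to numeric grade points (4, 3, 2, with 1 as the fallback):

```python
import numpy as np

scores = np.array([45, 78, 92, 65, 88, 54])

# Conditions are checked in order, so a score of 92 stops at the first one
conditions = [scores >= 90, scores >= 80, scores >= 70]
points = [4, 3, 2]                       # hypothetical grade points

# default fills every element that matches no condition
grade_points = np.select(conditions, points, default=1)
print(grade_points)   # [1 2 4 1 3 1]
```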

Aggregating Boolean Arrays

Use np.any() and np.all() to reduce boolean arrays to single values, with optional axis specification for multi-dimensional arrays.

data = np.array([1, 2, 3, 4, 5])

print(np.any(data > 4))   # True (at least one element > 4)
print(np.all(data > 0))   # True (all elements > 0)
print(np.all(data > 2))   # False (not all elements > 2)

# Count True values
print(np.sum(data > 2))   # 3 (True=1, False=0)
print(np.count_nonzero(data > 2))  # 3 (alternative)

# Multi-dimensional aggregation
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Check along axes
print(np.any(matrix > 5, axis=0))  # [ True  True  True]
print(np.all(matrix > 0, axis=1))  # [ True  True  True]

# Find positions of True values
indices = np.where(matrix > 5)
print(indices)  # (array([1, 2, 2, 2]), array([2, 0, 1, 2]))
# tolist() converts to plain Python ints so the pairs print cleanly
print(list(zip(indices[0].tolist(), indices[1].tolist())))  # [(1, 2), (2, 0), (2, 1), (2, 2)]
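A few related reductions are often handy alongside np.any() and np.all(). This sketch reuses the 3x3 matrix from this section and shows per-row counts, the fraction of matches, and np.argwhere() as a way to get one (row, column) pair per match.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
mask = matrix > 5

# Count matches per row
per_row = np.count_nonzero(mask, axis=1)
print(per_row)                      # [0 1 3]

# Fraction of matching elements (mean of True=1, False=0)
fraction = mask.mean()
print(round(float(fraction), 3))    # 0.444

# One (row, col) pair per match
positions = np.argwhere(mask)
print(positions.tolist())           # [[1, 2], [2, 0], [2, 1], [2, 2]]
```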

Performance Considerations

NumPy comparisons vastly outperform Python loops due to vectorization and C-level implementation.

import time

# Large dataset
large_array = np.random.randint(0, 100, size=1000000)

# NumPy vectorized approach (perf_counter is the clock intended for timing)
start = time.perf_counter()
result_np = large_array > 50
numpy_time = time.perf_counter() - start

# Python list comprehension approach (includes the array-to-list conversion)
start = time.perf_counter()
result_py = [x > 50 for x in large_array.tolist()]
python_time = time.perf_counter() - start

print(f"NumPy: {numpy_time:.6f}s")
print(f"Python: {python_time:.6f}s")
print(f"Speedup: {python_time/numpy_time:.1f}x")
# Example output (varies by machine): NumPy: 0.001s, Python: 0.08s, Speedup: 80x
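A single timed run can be noisy. As a rough alternative, the sketch below uses the standard-library timeit module and takes the best of several repeats; the array size, seed, and repeat counts are arbitrary assumptions, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)              # seeded for repeatability
large_array = rng.integers(0, 100, size=200_000)
as_list = large_array.tolist()              # convert once, outside the timing

# Best of 5 repeats, each repeat running the comparison 10 times
numpy_time = min(timeit.repeat(lambda: large_array > 50, number=10, repeat=5))
python_time = min(timeit.repeat(lambda: [x > 50 for x in as_list], number=10, repeat=5))

print(f"NumPy:  {numpy_time:.6f}s for 10 runs")
print(f"Python: {python_time:.6f}s for 10 runs")

# Both approaches should agree element for element
same_result = (large_array > 50).tolist() == [x > 50 for x in as_list]
print(same_result)   # True
```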

Memory efficiency matters for boolean indexing:

# Memory-efficient: boolean array is smaller than data
data = np.random.rand(1000000)
mask = data > 0.5  # bool array: 1 byte per element
filtered = data[mask]  # Creates new array only for True values

# Less efficient: np.where builds an extra integer index array from the mask
# Avoid: data[np.where(data > 0.5)[0]]
# Prefer: data[data > 0.5]
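The memory difference can be checked with the nbytes attribute. This sketch assumes float64 data and a seeded generator for repeatability; the size of the index array depends on how many elements pass the test and on the platform's index integer width.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random(1_000_000)          # float64: 8 bytes per element
mask = data > 0.5                     # bool: 1 byte per element
index = np.where(mask)[0]             # integer positions of the True values

print(data.nbytes)    # 8000000
print(mask.nbytes)    # 1000000
print(index.nbytes)   # one index integer per True element, on top of the mask

# Both routes select the same values
same = np.array_equal(data[mask], data[index])
print(same)           # True
```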

Chain comparisons efficiently using bitwise operators (&, |, ~) with parentheses:

values = np.array([15, 25, 35, 45, 55])

# Correct: use parentheses with bitwise operators
valid = (values > 20) & (values < 50)
print(values[valid])  # [25 35 45]

# Wrong: & binds more tightly than comparison operators, so this parses as
# values > (20 & values) < 50 and raises ValueError
# valid = values > 20 & values < 50
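The failure without parentheses can be demonstrated directly. The sketch below catches the resulting ValueError and also shows np.logical_and() as a function-call alternative that avoids the precedence issue.

```python
import numpy as np

values = np.array([15, 25, 35, 45, 55])

# Without parentheses this is values > (20 & values) < 50, a chained
# comparison that needs the truth value of a whole array
try:
    values > 20 & values < 50
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__
print(error_name)      # ValueError

# np.logical_and takes the two masks as arguments, so no parentheses trap
valid = np.logical_and(values > 20, values < 50)
print(values[valid])   # [25 35 45]
```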

NumPy comparison operators form the foundation for data filtering, conditional transformations, and logical operations in scientific computing. Master these patterns to write efficient, readable array-processing code.
