NumPy - Array Indexing with Examples
Key Insights
- NumPy provides multiple indexing methods including basic slicing, integer array indexing, and boolean indexing, each optimized for different access patterns and performance characteristics
- Understanding the difference between views and copies is critical—basic slicing returns views while fancy indexing creates copies, directly impacting memory usage and modification behavior
- Advanced indexing techniques like np.ix_, broadcasting, and ellipsis notation enable elegant solutions to complex multi-dimensional array manipulation problems
Basic Indexing and Slicing
NumPy arrays support Python’s standard indexing syntax with zero-based indices. Single-dimensional arrays behave like Python lists, but multi-dimensional arrays extend this concept across multiple axes.
import numpy as np
# 1D array indexing
arr = np.array([10, 20, 30, 40, 50])
print(arr[0]) # 10
print(arr[-1]) # 50
print(arr[1:4]) # [20 30 40]
print(arr[::2]) # [10 30 50]
# 2D array indexing
arr_2d = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]])
print(arr_2d[0, 2]) # 3 (row 0, column 2)
print(arr_2d[1]) # [5 6 7 8] (entire row 1)
print(arr_2d[:, 2]) # [3 7 11] (entire column 2)
print(arr_2d[0:2, 1:3]) # [[2 3] [6 7]]
The slice notation start:stop:step works independently on each dimension. Omitted values take defaults: step=1, and for a positive step start=0 and stop=size (for a negative step the slice runs from the last element back to the first). Basic slicing creates views, not copies, so modifying a slice modifies the original array.
arr = np.arange(10)
slice_view = arr[2:7]
slice_view[0] = 999
print(arr) # [0 1 999 3 4 5 6 7 8 9]
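When it is unclear whether an indexing expression returned a view or a copy, the result can be checked directly. The following is a minimal sketch using np.shares_memory and the .base attribute; the variable names are illustrative only.

```python
import numpy as np

arr = np.arange(10)

view = arr[2:7]                 # basic slice: shares memory with arr
independent = arr[2:7].copy()   # explicit copy: owns its own data

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False
print(view.base is arr)                    # True

independent[0] = 999            # does not touch arr
print(arr[2])                   # 2
```

Calling .copy() on a slice is the usual way to get an independent array when the original must stay unchanged.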
Integer Array Indexing (Fancy Indexing)
Integer array indexing allows selecting arbitrary elements using arrays of indices. Unlike basic slicing, this creates copies rather than views.
arr = np.array([10, 20, 30, 40, 50, 60])
# Select specific indices
indices = np.array([0, 2, 4])
print(arr[indices]) # [10 30 50]
# 2D array fancy indexing
arr_2d = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
rows = np.array([0, 2, 1])
cols = np.array([1, 2, 0])
print(arr_2d[rows, cols]) # [2 9 4]
This technique excels at non-contiguous element selection. The row and column index arrays must have compatible shapes—broadcasting rules apply.
# Selecting multiple rows with all columns
arr_2d = np.arange(20).reshape(4, 5)
row_indices = np.array([0, 2, 3])
print(arr_2d[row_indices, :])
# [[ 0 1 2 3 4]
# [10 11 12 13 14]
# [15 16 17 18 19]]
# Combining with slicing
print(arr_2d[row_indices, 1:4])
# [[ 1 2 3]
# [11 12 13]
# [16 17 18]]
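The broadcasting rule for index arrays also determines the shape of the result: the output takes the broadcast shape of the indices, not the shape of the source array. A small sketch of this behaviour, with illustrative variable names:

```python
import numpy as np

arr_2d = np.arange(20).reshape(4, 5)

# A 2D index array on axis 0 (with a scalar on axis 1) gives a 2D result
idx = np.array([[0, 1],
                [2, 3]])
picked = arr_2d[idx, 0]          # column 0 of rows 0..3, arranged as (2, 2)
print(picked.shape)              # (2, 2)
print(picked.tolist())           # [[0, 5], [10, 15]]

# A scalar column index broadcasts against an array of row indices
rows = np.array([0, 2, 3])
print(arr_2d[rows, 4].tolist())  # [4, 14, 19]

# Index arrays that cannot be broadcast together raise IndexError
try:
    arr_2d[np.array([0, 1, 2]), np.array([0, 1])]
except IndexError as exc:
    print("IndexError:", exc)
```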
Boolean Indexing
Boolean indexing uses boolean arrays as masks to filter elements. This is exceptionally powerful for conditional data extraction.
arr = np.array([12, 5, 18, 3, 25, 9])
# Create boolean mask
mask = arr > 10
print(mask) # [True False True False True False]
print(arr[mask]) # [12 18 25]
# Direct conditional indexing
print(arr[arr > 10]) # [12 18 25]
# Multiple conditions
print(arr[(arr > 5) & (arr < 20)]) # [12 18 9]
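Boolean indexing always returns a copy. The mask must match the shape of the axes it indexes: either the full shape of the array, or the length of a single axis, such as a one-dimensional mask with one entry per row. Masks are not broadcast to the array's shape.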
# 2D boolean indexing
arr_2d = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
# Mask applied element-wise
print(arr_2d[arr_2d > 5]) # [6 7 8 9] (flattened)
# Row-wise filtering
row_sums = arr_2d.sum(axis=1)
print(arr_2d[row_sums > 12])
# [[4 5 6]
# [7 8 9]]
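A mask can be applied to any single axis, not only the rows. The sketch below filters columns, shows the integer indices a mask corresponds to via np.nonzero, and shows the error raised when a mask has the wrong length. Variable names are illustrative.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Column mask: one boolean per column, applied to axis 1
col_mask = arr_2d.sum(axis=0) > 12        # column sums are 12, 15, 18
print(arr_2d[:, col_mask].tolist())       # [[2, 3], [5, 6], [8, 9]]

# np.nonzero returns the integer indices that a mask selects
rows, cols = np.nonzero(arr_2d > 5)
print(rows.tolist(), cols.tolist())       # [1, 2, 2, 2] [2, 0, 1, 2]

# A mask whose length does not match the indexed axis raises IndexError
try:
    arr_2d[np.array([True, False])]
except IndexError as exc:
    print("IndexError:", exc)
```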
Advanced Indexing with np.ix_
The np.ix_ function constructs open mesh grids from multiple sequences, enabling elegant multi-dimensional indexing.
arr = np.arange(35).reshape(5, 7)
# Select rows 1, 3, 4 and columns 0, 2, 5
rows = np.array([1, 3, 4])
cols = np.array([0, 2, 5])
# Without np.ix_ - requires reshaping
result1 = arr[rows[:, np.newaxis], cols]
# With np.ix_ - cleaner syntax
result2 = arr[np.ix_(rows, cols)]
print(result2)
# [[ 7 9 12]
# [21 23 26]
# [28 30 33]]
This approach scales to any number of dimensions and maintains code readability.
# 3D example
arr_3d = np.arange(60).reshape(3, 4, 5)
dim0_idx = [0, 2]
dim1_idx = [1, 3]
dim2_idx = [0, 2, 4]
result = arr_3d[np.ix_(dim0_idx, dim1_idx, dim2_idx)]
print(result.shape) # (2, 2, 3)
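To see why np.ix_ is needed, it helps to compare it with passing the same index arrays directly. Plain integer arrays are paired element by element, while np.ix_ reshapes them so they broadcast into a grid. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr = np.arange(35).reshape(5, 7)
rows = np.array([1, 3, 4])
cols = np.array([0, 2, 5])

# Paired indexing picks elements (1, 0), (3, 2), (4, 5): three values
paired = arr[rows, cols]
print(paired.tolist())                    # [7, 23, 33]

# np.ix_ returns index arrays shaped to broadcast against each other
row_grid, col_grid = np.ix_(rows, cols)
print(row_grid.shape, col_grid.shape)     # (3, 1) (1, 3)

# The broadcast indices select the full 3x3 block
block = arr[row_grid, col_grid]
print(block.shape)                        # (3, 3)
```

The paired result is the diagonal of the block selected with np.ix_.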
Ellipsis and Newaxis
The ellipsis (...) represents multiple colons needed to select all remaining dimensions. np.newaxis (or None) adds a new axis of length one.
arr = np.arange(24).reshape(2, 3, 4)
# These two are equivalent
print(arr[0, :, :].shape) # (3, 4)
print(arr[0, ...].shape) # (3, 4)
# Ellipsis can also come first: index 0 along the last axis
print(arr[..., 0].shape) # (2, 3)
# Newaxis for broadcasting
arr_1d = np.array([1, 2, 3])
arr_2d = np.array([[10], [20], [30]])
# Make the leading axis explicit: (1, 3) + (3, 1) broadcasts to (3, 3)
result = arr_1d[np.newaxis, :] + arr_2d
print(result)
# [[11 12 13]
# [21 22 23]
# [31 32 33]]
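The position of np.newaxis in the index decides where the new length-one axis appears. The sketch below prints the resulting shapes and combines newaxis with an ellipsis; the names vec and cube are illustrative.

```python
import numpy as np

vec = np.array([1, 2, 3])
print(vec[np.newaxis, :].shape)      # (1, 3)  row vector
print(vec[:, np.newaxis].shape)      # (3, 1)  column vector
print(vec[None, :].shape)            # (1, 3)  None is an alias for np.newaxis

# Column times row broadcasts to an outer product
outer = vec[:, np.newaxis] * vec[np.newaxis, :]
print(outer.tolist())                # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

cube = np.arange(24).reshape(2, 3, 4)
print(cube[..., np.newaxis].shape)   # (2, 3, 4, 1)  new trailing axis
print(cube[0, ..., 1].shape)         # (3,)  ellipsis fills the middle axis
```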
Modifying Arrays Through Indexing
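Assignment through indexing follows the same rules as selection. An indexed expression on the left-hand side of an assignment writes into the original array for all three indexing styles. The copy made by fancy or boolean indexing only matters when the selection is first stored in another variable or indexed a second time.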
arr = np.arange(10)
# Basic slicing - modifies original
arr[2:5] = 0
print(arr) # [0 1 0 0 0 5 6 7 8 9]
# Fancy indexing - assignment writes into the original (no copy is made)
arr = np.arange(10)
indices = [1, 3, 5]
arr[indices] = 99
print(arr) # [0 99 2 99 4 99 6 7 8 9]
# Boolean indexing - modifies original
arr = np.arange(10)
arr[arr > 5] = -1
print(arr) # [0 1 2 3 4 5 -1 -1 -1 -1]
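Two pitfalls follow from the copy behavior of fancy indexing. The sketch below shows chained indexing, which writes into a temporary copy, and in-place operators with repeated indices, which are applied only once per index. Using np.add.at for accumulation is one option, shown here as an illustration rather than a recommendation from the text above.

```python
import numpy as np

arr = np.arange(10)
indices = [1, 3, 5]

# Chained indexing: arr[indices] is a temporary copy, so arr is unchanged
arr[indices][0] = 99
print(arr[1])                        # 1

# A single indexed assignment writes into arr itself
arr[indices] = 99
print(arr[1])                        # 99

# Repeated indices with += are applied once, not accumulated
counts_buffered = np.zeros(4, dtype=int)
counts_buffered[[0, 0, 2]] += 1
print(counts_buffered.tolist())      # [1, 0, 1, 0]

# np.add.at performs the unbuffered, accumulating version
counts_accumulated = np.zeros(4, dtype=int)
np.add.at(counts_accumulated, [0, 0, 2], 1)
print(counts_accumulated.tolist())   # [2, 0, 1, 0]
```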
Broadcasting applies during assignment when shapes are compatible.
arr_2d = np.zeros((4, 5))
arr_2d[:, 1] = [10, 20, 30, 40] # Column assignment from a length-4 sequence
arr_2d[2, :] = 99 # Scalar broadcast across row 2
print(arr_2d)
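Broadcasting on assignment is not limited to scalars. The following sketch broadcasts a row vector over several selected rows and a column vector over a block of columns, then shows the error for incompatible shapes. The array name grid is illustrative.

```python
import numpy as np

grid = np.zeros((4, 5), dtype=int)

# A length-5 row is broadcast to every row selected by the index
grid[[0, 3], :] = [1, 2, 3, 4, 5]
print(grid[0].tolist(), grid[3].tolist())  # [1, 2, 3, 4, 5] [1, 2, 3, 4, 5]

# A (4, 1) column is broadcast across a (4, 2) block of columns
grid[:, 1:3] = np.array([[10], [20], [30], [40]])
print(grid[:, 1].tolist())                 # [10, 20, 30, 40]
print(grid[:, 2].tolist())                 # [10, 20, 30, 40]

# Shapes that cannot be broadcast raise ValueError
try:
    grid[:, 1] = [1, 2, 3]
except ValueError as exc:
    print("ValueError:", exc)
```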
Performance Considerations
Different indexing methods have distinct performance characteristics. Basic slicing is fastest (no data copy), while fancy indexing requires memory allocation.
import time
arr = np.random.rand(1000, 1000)
# Basic slicing - view (fast)
start = time.perf_counter()
for _ in range(1000):
    view = arr[100:200, 100:200]
print(f"Slicing: {time.perf_counter() - start:.4f}s")
# Fancy indexing - copy (slower)
indices = np.arange(100, 200)
start = time.perf_counter()
for _ in range(1000):
    copy = arr[indices, :]
print(f"Fancy indexing: {time.perf_counter() - start:.4f}s")
# Boolean indexing - copy (slowest for large masks)
start = time.perf_counter()
for _ in range(1000):
    mask = arr > 0.5
    filtered = arr[mask]
print(f"Boolean indexing: {time.perf_counter() - start:.4f}s")
np.take and np.compress are function forms of fancy and boolean indexing. They can be faster in some cases, but the difference depends on the NumPy version, dtype, and array size, so measure before relying on it.
arr = np.random.rand(10000)
indices = np.random.randint(0, 10000, 1000)
# Same result as arr[indices]
result = np.take(arr, indices)
# Same result as arr[arr > 0.5]
result = np.compress(arr > 0.5, arr)
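Whether the function forms pay off is best checked on the actual workload. The sketch below first confirms that both spellings return identical values, then times them with timeit. The array sizes and repeat counts are arbitrary choices, and no particular ranking should be assumed in advance.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
arr = rng.random(10_000)
indices = rng.integers(0, arr.size, size=1_000)
mask = arr > 0.5

# Both spellings must return the same values before timing them
assert np.array_equal(np.take(arr, indices), arr[indices])
assert np.array_equal(np.compress(mask, arr), arr[mask])

candidates = {
    "arr[indices]": lambda: arr[indices],
    "np.take": lambda: np.take(arr, indices),
    "arr[mask]": lambda: arr[mask],
    "np.compress": lambda: np.compress(mask, arr),
}

calls = 1_000
for label, fn in candidates.items():
    # Best of 5 runs reduces noise from other processes
    best = min(timeit.repeat(fn, number=calls, repeat=5))
    print(f"{label:>14}: {best / calls * 1e6:.2f} microseconds per call")
```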
NumPy indexing provides the foundation for efficient array manipulation. Master these techniques to write concise, performant numerical code that leverages NumPy’s optimized C implementations rather than Python loops.