NumPy - Array Slicing with Examples
Key Insights
- NumPy slicing uses the syntax `array[start:stop:step]` and supports negative indexing, enabling powerful data extraction patterns without copying underlying data
- Multi-dimensional arrays require comma-separated slice notation `array[row_slice, col_slice]`, with ellipsis (...) available for selecting all remaining dimensions
- Boolean and fancy indexing extend basic slicing capabilities, allowing conditional filtering and arbitrary index-based selection for complex data manipulation
Basic Slicing Syntax
NumPy array slicing follows Python’s standard slicing convention but extends it to multiple dimensions. The basic syntax [start:stop:step] creates a view into the original array rather than copying data, making operations memory-efficient.
import numpy as np
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# Basic slicing
print(arr[2:7]) # [2 3 4 5 6]
print(arr[:5]) # [0 1 2 3 4] - from start up to (but not including) index 5
print(arr[7:]) # [7 8 9] - from index 7 to end
print(arr[::2]) # [0 2 4 6 8] - every second element
print(arr[1::3]) # [1 4 7] - start at 1, step by 3
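The bracket syntax above is shorthand for Python's built-in `slice` objects, which can be stored in a variable and reused. The following is a small sketch of that equivalence; the variable names (`middle`, `every_third`) are illustrative only. It also shows that out-of-range bounds in a slice are clipped instead of raising an error.

```python
import numpy as np

arr = np.arange(10)

# A slice object carries the same start/stop/step as the bracket syntax
middle = slice(2, 7)             # same as 2:7
every_third = slice(1, None, 3)  # same as 1::3 (None means "not given")

a = arr[middle]
b = arr[every_third]
print(a)  # [2 3 4 5 6]
print(b)  # [1 4 7]

# A stop value past the end is clipped to the array length
c = arr[5:100]
print(c)  # [5 6 7 8 9]
```

Storing a slice in a name can help when the same range is applied to several arrays.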
Negative indices count from the end of the array, and negative steps reverse the selection direction:
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print(arr[-3:]) # [7 8 9] - last three elements
print(arr[:-3]) # [0 1 2 3 4 5 6] - all except last three
print(arr[-5:-2]) # [5 6 7] - slice using negative indices
print(arr[::-1]) # [9 8 7 6 5 4 3 2 1 0] - reverse array
print(arr[8:2:-1]) # [8 7 6 5 4 3] - reverse slice with bounds
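One detail of reverse slices is easy to trip over: the stop index is still exclusive, and a stop of -1 refers to the last element, not "one before index 0". The sketch below illustrates this with the same ten-element array; the variable names are chosen for illustration.

```python
import numpy as np

arr = np.arange(10)
n = len(arr)

# A negative index i refers to position n + i
tail_neg = arr[-3:]
tail_pos = arr[n - 3:]
print(np.array_equal(tail_neg, tail_pos))  # True

# To walk backwards down to and including index 0, leave the stop empty
down_to_zero = arr[8::-1]
print(down_to_zero)  # [8 7 6 5 4 3 2 1 0]

# A stop of -1 means index 9 here, so stepping backwards from 8 selects nothing
empty = arr[8:-1:-1]
print(empty.size)  # 0
```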
Multi-Dimensional Array Slicing
Multi-dimensional arrays take a separate slice specification for each dimension, separated by commas. Omitting trailing dimensions selects all elements along those axes, so matrix[1:3] is the same as matrix[1:3, :].
# Create a 2D array
matrix = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16]
])
# Row slicing
print(matrix[1:3])
# [[ 5 6 7 8]
# [ 9 10 11 12]]
# Column slicing (all rows, specific columns)
print(matrix[:, 1:3])
# [[ 2 3]
# [ 6 7]
# [10 11]
# [14 15]]
# Combined row and column slicing
print(matrix[1:3, 1:3])
# [[ 6 7]
# [10 11]]
# Step slicing in 2D
print(matrix[::2, ::2]) # Every other row and column
# [[ 1 3]
# [ 9 11]]
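A related point that the examples above do not show: indexing an axis with a single integer removes that axis from the result, while a length-one slice keeps it. The sketch below rebuilds the same 4x4 matrix with `np.arange` (a convenience choice here) and compares the resulting shapes.

```python
import numpy as np

matrix = np.arange(1, 17).reshape(4, 4)

# Integer index: the indexed axis is dropped
row = matrix[1]
col = matrix[:, 2]
print(row.shape, col.shape)  # (4,) (4,)

# Length-one slice: the axis is kept with size 1
row_2d = matrix[1:2]
col_2d = matrix[:, 2:3]
print(row_2d.shape, col_2d.shape)  # (1, 4) (4, 1)
```

Keeping the axis is often useful when the result must later broadcast against the original 2D array.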
For higher-dimensional arrays, the ellipsis (...) notation selects all remaining dimensions:
# Create a 3D array (2x3x4)
cube = np.arange(24).reshape(2, 3, 4)
# Select all of first matrix
print(cube[0, ...])
# [[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]]
# Select the column at index 1 (the second of four) across all matrices
print(cube[..., 1])
# [[ 1 5 9]
# [13 17 21]]
# Equivalent to cube[:, :, 1:3]
print(cube[..., 1:3])
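The ellipsis can be read as "as many full slices (`:`) as are needed to fill the remaining axes". A quick way to convince yourself is to compare ellipsis expressions against their spelled-out forms, as in this sketch on the same 2x3x4 array:

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)

# Trailing ellipsis: cube[0, ...] is cube[0, :, :]
same_first = np.array_equal(cube[0, ...], cube[0, :, :])

# Leading ellipsis: cube[..., 1:3] is cube[:, :, 1:3]
same_last = np.array_equal(cube[..., 1:3], cube[:, :, 1:3])
print(same_first, same_last)  # True True

# Ellipsis in the middle: fix the first and last axes, keep the one between
middle = cube[0, ..., 1]
print(middle)  # [1 5 9]
print(cube[..., 1:3].shape)  # (2, 3, 2)
```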
Modifying Arrays Through Slices
Since slices create views rather than copies, modifications affect the original array. Use .copy() to create independent arrays.
original = np.array([0, 1, 2, 3, 4, 5])
view = original[2:5]
# Modify the view
view[0] = 99
print(original) # [0 1 99 3 4 5] - original changed
# Create independent copy
original = np.array([0, 1, 2, 3, 4, 5])
independent = original[2:5].copy()
independent[0] = 99
print(original) # [0 1 2 3 4 5] - original unchanged
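One way to check whether an array is a view is its `.base` attribute, which points to the array that owns the memory, or is `None` when the array owns its own data. The following sketch applies that check to the slice and the copy from the example above.

```python
import numpy as np

original = np.array([0, 1, 2, 3, 4, 5])
view = original[2:5]
independent = original[2:5].copy()

# A basic slice refers back to the array it was taken from
print(view.base is original)     # True
# A copy owns its data, so it has no base
print(independent.base is None)  # True

# Writing through the view with [:] changes the original in place
view[:] = 0
print(original)  # [0 1 0 0 0 5]
```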
Broadcasting works with slice assignment:
matrix = np.zeros((4, 4))
# Assign value to slice
matrix[1:3, 1:3] = 5
print(matrix)
# [[0. 0. 0. 0.]
# [0. 5. 5. 0.]
# [0. 5. 5. 0.]
# [0. 0. 0. 0.]]
# Assign array to slice
matrix[0, :] = np.array([1, 2, 3, 4])
print(matrix[0]) # [1. 2. 3. 4.]
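The assigned value only needs to be broadcastable to the shape of the target slice, so a single row or a single column can fill a larger block. The sketch below shows both cases, plus the `ValueError` NumPy raises when the shapes cannot be broadcast; the flag name `shape_error` is illustrative.

```python
import numpy as np

matrix = np.zeros((4, 4))

# A 1D row of length 4 is repeated across both selected rows
matrix[1:3, :] = np.array([1, 2, 3, 4])

# A (4, 1) column is repeated across both selected columns
matrix[:, 0:2] = np.array([[10], [20], [30], [40]])
print(matrix)
# [[10. 10.  0.  0.]
#  [20. 20.  3.  4.]
#  [30. 30.  3.  4.]
#  [40. 40.  0.  0.]]

# Shapes that cannot be broadcast are rejected
shape_error = False
try:
    matrix[1:3, 1:3] = np.array([1, 2, 3])  # (3,) does not fit (2, 2)
except ValueError:
    shape_error = True
print(shape_error)  # True
```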
Boolean Indexing
Boolean arrays enable conditional selection, returning elements where the condition evaluates to True. This creates a copy rather than a view.
arr = np.array([10, 15, 20, 25, 30, 35, 40])
# Create boolean mask
mask = arr > 25
print(mask) # [False False False False True True True]
# Select elements
print(arr[mask]) # [30 35 40]
# Direct conditional indexing
print(arr[arr % 2 == 0]) # [10 20 30 40] - even numbers
# Multiple conditions with logical operators
print(arr[(arr > 15) & (arr < 35)]) # [20 25 30]
print(arr[(arr < 15) | (arr > 35)]) # [10 40]
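A few mask operations tend to come up alongside the ones above: inverting a mask with `~`, counting matches, and the error raised when Python's `and` is used in place of `&`. The sketch below covers all three on the same array; the flag name `used_and_failed` is illustrative.

```python
import numpy as np

arr = np.array([10, 15, 20, 25, 30, 35, 40])
mask = (arr > 15) & (arr < 35)

# Invert a mask with ~ to select everything that did NOT match
rest = arr[~mask]
print(rest)  # [10 15 35 40]

# True counts as 1, so summing a mask counts the matches
n_match = int(mask.sum())
print(n_match)  # 3

# Python's `and` asks for a single truth value and fails on arrays
used_and_failed = False
try:
    arr[(arr > 15) and (arr < 35)]
except ValueError:
    used_and_failed = True
print(used_and_failed)  # True
```

The parentheses around each comparison matter because `&` and `|` bind more tightly than `>` and `<`.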
Boolean indexing works with multi-dimensional arrays:
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
# Select elements greater than 5
print(matrix[matrix > 5]) # [6 7 8 9] - flattened result
# Modify elements conditionally
matrix[matrix % 2 == 0] = 0
print(matrix)
# [[1 0 3]
# [0 5 0]
# [7 0 9]]
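Assigning through a boolean mask changes the array in place. When the original values should be kept, one alternative (a different technique from the mask assignment above) is `np.where`, which builds a new array from a condition. A minimal sketch:

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# Where the condition holds take 0, otherwise take the value from matrix
zeroed = np.where(matrix % 2 == 0, 0, matrix)
print(zeroed)
# [[1 0 3]
#  [0 5 0]
#  [7 0 9]]

# The input array is left untouched
print(matrix[0])  # [1 2 3]
```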
Fancy Indexing
Fancy indexing uses integer arrays or lists to select arbitrary elements. Unlike basic slicing, this creates copies.
arr = np.array([10, 20, 30, 40, 50, 60, 70])
# Index with list
indices = [0, 2, 5]
print(arr[indices]) # [10 30 60]
# Index with array
idx_array = np.array([1, 3, 6])
print(arr[idx_array]) # [20 40 70]
# Repeated indices
print(arr[[0, 0, 2, 2]]) # [10 10 30 30]
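The copy behavior applies to reading. Assigning to a fancy-indexed expression still writes into the original array. The sketch below contrasts the two cases; the name `picked` is illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

# Reading with fancy indexing gives a copy: changing it leaves arr alone
picked = arr[[0, 2, 5]]
picked[0] = -1
print(arr[0])  # 10

# Assigning through fancy indexing writes into arr itself
arr[[0, 2, 5]] = 0
print(arr)  # [ 0 20  0 40 50  0 70]
```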
Fancy indexing in multiple dimensions:
matrix = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])
# Select specific elements
rows = np.array([0, 1, 2])
cols = np.array([1, 2, 3])
print(matrix[rows, cols]) # [ 2 7 12] - elements at (0, 1), (1, 2), (2, 3)
# Select entire rows
row_indices = [0, 2]
print(matrix[row_indices])
# [[ 1 2 3 4]
# [ 9 10 11 12]]
# Combine with slicing
print(matrix[[0, 2], 1:3])
# [[ 2 3]
# [10 11]]
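When two index lists are passed together they are paired element by element, which is different from selecting the block where those rows and columns intersect. The sketch below contrasts the two on the same 3x4 matrix (rebuilt with `np.arange` for brevity), using `np.ix_` for the block selection.

```python
import numpy as np

matrix = np.arange(1, 13).reshape(3, 4)

# Paired indices: picks (0, 1) and (2, 3), giving a 1D result
paired = matrix[[0, 2], [1, 3]]
print(paired)  # [ 2 12]

# np.ix_ builds an open mesh: rows 0 and 2 crossed with columns 1 and 3
block = matrix[np.ix_([0, 2], [1, 3])]
print(block)
# [[ 2  4]
#  [10 12]]
```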
Advanced Slicing Patterns
Combining slicing techniques enables complex data extraction patterns common in data analysis workflows.
# Create sample dataset
data = np.random.randint(0, 100, size=(10, 5))
# Extract every other row, first three columns
subset = data[::2, :3]
# Get rows where first column > 50
filtered = data[data[:, 0] > 50]
# Select specific rows and columns
rows_of_interest = [1, 3, 7]
cols_of_interest = [0, 2, 4]
selection = data[np.ix_(rows_of_interest, cols_of_interest)]
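To see what each of these expressions returns, it can help to inspect the shapes. The sketch below repeats the three selections with a seeded generator so the run is reproducible; `np.random.default_rng(0)` is a substitution made here for illustration, not what the example above uses.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (an assumption)
data = rng.integers(0, 100, size=(10, 5))

subset = data[::2, :3]            # rows 0, 2, 4, 6, 8 and the first 3 columns
filtered = data[data[:, 0] > 50]  # whole rows, chosen by a 1D row mask
selection = data[np.ix_([1, 3, 7], [0, 2, 4])]

print(subset.shape)     # (5, 3)
print(selection.shape)  # (3, 3)

# The row count of filtered depends on the data; the column count does not
print(filtered.shape[1])  # 5
print(filtered.shape[0] == int((data[:, 0] > 50).sum()))  # True
```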
Working with structured data extraction:
# Simulate time-series data (100 samples, 3 features)
timeseries = np.random.randn(100, 3)
# Extract windows
window_size = 10
windows = np.array([timeseries[i:i+window_size] for i in range(0, 90, 10)])
print(windows.shape) # (9, 10, 3)
# Get last 20% of data for testing
split_point = int(len(timeseries) * 0.8)
train = timeseries[:split_point]
test = timeseries[split_point:]
# Extract specific feature columns
feature_1 = timeseries[:, 0]
features_2_3 = timeseries[:, 1:]
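Because the windows above do not overlap, the same result can also be obtained without a Python loop by slicing and reshaping. This is an alternative sketch, not the approach used in the example; it assumes a C-contiguous input and a seeded generator for reproducibility, and the names `looped` and `reshaped` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded stand-in for the random data
timeseries = rng.standard_normal((100, 3))

window_size = 10
n_windows = 9

# Loop version: stacks nine slices into a new array (a copy)
looped = np.array([timeseries[i:i + window_size] for i in range(0, 90, 10)])

# Reshape version: take the first 90 samples and split them into 9 blocks
reshaped = timeseries[:n_windows * window_size].reshape(n_windows, window_size, 3)

print(np.array_equal(looped, reshaped))        # True
print(np.shares_memory(timeseries, looped))    # False
print(np.shares_memory(timeseries, reshaped))  # True
```

The reshaped version is a view here, so writing into it would also change `timeseries`.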
Understanding the distinction between views and copies prevents unexpected behavior. Basic slicing returns views for memory efficiency, while boolean and fancy indexing return copies for flexibility. Use np.shares_memory() to verify:
arr = np.array([1, 2, 3, 4, 5])
view = arr[1:4]
copy = arr[[1, 2, 3]]
print(np.shares_memory(arr, view)) # True
print(np.shares_memory(arr, copy)) # False