NumPy - Reshape Array (np.reshape)

Key Insights

  • np.reshape() transforms array dimensions without copying data when possible, using strides to create views that share memory with the original array
  • Reshaping requires the total number of elements to stay constant; pass -1 for one dimension to have NumPy calculate it automatically
  • Understanding C-contiguous vs Fortran-contiguous memory layouts is critical for performance optimization and avoiding unnecessary copies

Understanding Array Reshaping Fundamentals

Array reshaping changes the shape of an array without altering its data. NumPy stores array data in a memory buffer, with metadata describing the shape and the strides used to index it. When you reshape, NumPy attempts to create a view: a new array object that shares the underlying data buffer.

import numpy as np

# Create a 1D array
arr = np.arange(12)
print(f"Original shape: {arr.shape}")
print(arr)

# Reshape to 2D
reshaped = arr.reshape(3, 4)
print(f"\nReshaped to (3, 4):\n{reshaped}")

# Reshape to 3D
reshaped_3d = arr.reshape(2, 2, 3)
print(f"\nReshaped to (2, 2, 3):\n{reshaped_3d}")

Output:

Original shape: (12,)
[ 0  1  2  3  4  5  6  7  8  9 10 11]

Reshaped to (3, 4):
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Reshaped to (2, 2, 3):
[[[ 0  1  2]
  [ 3  4  5]]

 [[ 6  7  8]
  [ 9 10 11]]]
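
To make the shape-and-strides metadata concrete, here is a minimal sketch that prints the strides before and after reshaping. It fixes dtype=np.int64, an assumption made here so the byte counts are the same on every platform. Only the metadata changes; the buffer is shared.

```python
import numpy as np

# Fix the dtype so each element is 8 bytes on every platform
arr = np.arange(12, dtype=np.int64)
reshaped = arr.reshape(3, 4)
reshaped_3d = arr.reshape(2, 2, 3)

# Strides are the number of bytes to step along each axis
print(arr.strides)          # (8,)
print(reshaped.strides)     # (32, 8)
print(reshaped_3d.strides)  # (48, 24, 8)

# All three arrays look at the same buffer
print(np.shares_memory(arr, reshaped_3d))  # True
```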

Automatic Dimension Calculation with -1

Use -1 for one dimension to let NumPy calculate it automatically based on the array size and other dimensions. This prevents calculation errors and makes code more maintainable.

arr = np.arange(24)

# Let NumPy calculate the second dimension
reshaped = arr.reshape(4, -1)
print(f"Shape (4, -1): {reshaped.shape}")
print(reshaped)

# Calculate first dimension
reshaped = arr.reshape(-1, 8)
print(f"\nShape (-1, 8): {reshaped.shape}")

# Works with multiple dimensions
reshaped = arr.reshape(2, 3, -1)
print(f"\nShape (2, 3, -1): {reshaped.shape}")

You can only use -1 once per reshape operation. Using it multiple times raises an error since NumPy cannot solve for multiple unknowns.

# This raises ValueError
try:
    arr.reshape(-1, -1)
except ValueError as e:
    print(f"Error: {e}")
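
The arithmetic behind -1 can be sketched as follows: the inferred dimension is the array size divided by the product of the dimensions you did specify. The variable names known, inferred, and failed are illustrative only.

```python
import numpy as np

arr = np.arange(24)

# The inferred dimension is the size divided by the product of the known ones
known = 2 * 3
inferred = arr.size // known
print(arr.reshape(2, 3, -1).shape == (2, 3, inferred))  # True

# If the known dimensions do not divide the size, reshape raises ValueError
try:
    arr.reshape(5, -1)
    failed = False
except ValueError:
    failed = True
print(f"reshape(5, -1) failed: {failed}")  # True
```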

Views vs Copies: Memory Efficiency

Reshaping returns a view when possible, avoiding memory allocation and copying. Views share data with the original array—modifying one affects the other. Understanding when NumPy creates copies is essential for performance.

arr = np.arange(12)
reshaped = arr.reshape(3, 4)

# Check if it's a view
print(f"Shares memory: {np.shares_memory(arr, reshaped)}")
print(f"Base is arr: {reshaped.base is arr}")

# Modify reshaped array
reshaped[0, 0] = 999
print(f"\nOriginal array after modification:\n{arr}")
print(f"Reshaped array:\n{reshaped}")

Output:

Shares memory: True
Base is arr: True

Original array after modification:
[999   1   2   3   4   5   6   7   8   9  10  11]
Reshaped array:
[[999   1   2   3]
 [  4   5   6   7]
 [  8   9  10  11]]

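When reshape cannot create a view, it returns a copy. This happens when the new shape cannot be expressed as strides over the existing buffer, for example after a transpose. Slicing alone is not enough to force a copy: a strided 1D slice such as np.arange(20)[::2] can still be reshaped as a view.

# Create a non-contiguous array by transposing
arr = np.arange(12).reshape(3, 4).T  # shape (4, 3), not C-contiguous
print(f"Is C-contiguous: {arr.flags['C_CONTIGUOUS']}")

# Regrouping the transposed elements in C order cannot be done
# with strides alone, so NumPy copies the data
reshaped = arr.reshape(2, 6)
print(f"Shares memory: {np.shares_memory(arr, reshaped)}")

# Modifying the copy doesn't affect the original
reshaped[0, 0] = 999
print(f"Original: {arr[0, 0]}")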
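
If a silent copy would be a problem (for instance, when later writes must reach the original array), one option is to check for it explicitly. The helper below, reshape_as_view, is a hypothetical name introduced for this sketch and is not part of NumPy; it relies only on np.shares_memory.

```python
import numpy as np

def reshape_as_view(arr, new_shape):
    """Reshape arr, raising if NumPy had to copy (hypothetical helper)."""
    result = arr.reshape(new_shape)
    # Empty arrays never share memory, so only check non-empty ones
    if arr.size > 0 and not np.shares_memory(arr, result):
        raise ValueError(f"reshape to {new_shape} would copy the data")
    return result

contiguous = np.arange(12)
view = reshape_as_view(contiguous, (3, 4))  # fine: a view

transposed = np.arange(12).reshape(3, 4).T  # shape (4, 3), not C-contiguous
try:
    reshape_as_view(transposed, (2, 6))
    copied = False
except ValueError:
    copied = True
print(f"Transposed reshape needed a copy: {copied}")  # True
```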

C-Contiguous vs Fortran-Contiguous Order

NumPy supports two memory layouts: C-contiguous (row-major) and Fortran-contiguous (column-major). In reshape, the order parameter controls the index order used to read the elements and place them into the new shape.

arr = np.arange(12)

# C-contiguous (default) - row-major order
c_order = arr.reshape(3, 4, order='C')
print("C-contiguous (row-major):")
print(c_order)
print(f"Flags: C={c_order.flags['C_CONTIGUOUS']}, F={c_order.flags['F_CONTIGUOUS']}")

# Fortran-contiguous - column-major order
f_order = arr.reshape(3, 4, order='F')
print("\nFortran-contiguous (column-major):")
print(f_order)
print(f"Flags: C={f_order.flags['C_CONTIGUOUS']}, F={f_order.flags['F_CONTIGUOUS']}")

Output:

C-contiguous (row-major):
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Flags: C=True, F=False

Fortran-contiguous (column-major):
[[ 0  3  6  9]
 [ 1  4  7 10]
 [ 2  5  8 11]]
Flags: C=False, F=True
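
One way to read the order='F' result: filling a (3, 4) array column by column gives the same array as filling a (4, 3) array row by row and transposing it. A short check, with via_transpose as an illustrative name:

```python
import numpy as np

arr = np.arange(12)
f_order = arr.reshape(3, 4, order='F')

# Filling (3, 4) column by column is the transpose of filling (4, 3) row by row
via_transpose = arr.reshape(4, 3).T
print(np.array_equal(f_order, via_transpose))    # True

# Both are views on the same buffer with the same strides
print(f_order.strides == via_transpose.strides)  # True
print(np.shares_memory(f_order, arr))            # True
```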

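Order matters for performance: traversing an array along the axis that is contiguous in memory is faster because of cache locality. The timing below pairs each layout with its contiguous access pattern (rows for C order, columns for F order):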

import time

# Large array for timing
large_arr = np.arange(10000000)

# C-order reshaping and row access
c_arr = large_arr.reshape(10000, 1000, order='C')
start = time.perf_counter()
for row in c_arr:
    _ = row.sum()
c_time = time.perf_counter() - start

# F-order reshaping and column access
f_arr = large_arr.reshape(10000, 1000, order='F')
start = time.perf_counter()
for i in range(f_arr.shape[1]):
    _ = f_arr[:, i].sum()
f_time = time.perf_counter() - start

print(f"C-order row access: {c_time:.4f}s")
print(f"F-order column access: {f_time:.4f}s")
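
To see the effect of a mismatched access pattern, it can help to time row and column traversal on the same C-ordered array. This is a rough sketch using timeit; the array size and repeat count are arbitrary choices, and the measured gap depends on the machine and NumPy build.

```python
import timeit
import numpy as np

c_arr = np.arange(4_000_000, dtype=np.int64).reshape(2000, 2000)

row = c_arr[0]     # elements adjacent in memory
col = c_arr[:, 0]  # elements 2000 * 8 bytes apart
print(row.strides, row.flags['C_CONTIGUOUS'])  # (8,) True
print(col.strides, col.flags['C_CONTIGUOUS'])  # (16000,) False

# Time summing every row versus every column of the same array
row_time = timeit.timeit(lambda: [r.sum() for r in c_arr], number=3)
col_time = timeit.timeit(lambda: [c.sum() for c in c_arr.T], number=3)
print(f"Row sums: {row_time:.4f}s, column sums: {col_time:.4f}s")
```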

Flattening and Unraveling Arrays

Converting multidimensional arrays to 1D is common. NumPy provides flatten() (always copies) and ravel() (returns view when possible).

arr = np.array([[1, 2, 3], [4, 5, 6]])

# flatten() always creates a copy
flat_copy = arr.flatten()
flat_copy[0] = 999
print(f"Original after flatten modification:\n{arr}")

# ravel() returns a view when possible
flat_view = arr.ravel()
flat_view[0] = 999
print(f"\nOriginal after ravel modification:\n{arr}")

# Equivalent to reshape(-1)
flat_reshape = arr.reshape(-1)
print(f"\nShares memory with ravel: {np.shares_memory(flat_view, flat_reshape)}")
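
The "when possible" qualifier on ravel() matters in practice. As a small illustration, raveling a transposed array in the default C order has to reorder the elements, so the result is a copy and writes to it do not reach the original:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
transposed = arr.T  # a view of arr, but not C-contiguous

flat = transposed.ravel()
print(flat)                                # [1 4 2 5 3 6]
print(np.shares_memory(transposed, flat))  # False

# Writing to the copy leaves the original untouched
flat[0] = 999
print(arr[0, 0])  # 1
```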

Reshaping with newaxis

Adding dimensions without changing data uses np.newaxis or None. This is technically not reshaping but dimension expansion.

arr = np.arange(5)
print(f"Original shape: {arr.shape}")

# Add dimension at different positions
row_vector = arr[np.newaxis, :]
print(f"Row vector shape: {row_vector.shape}")

col_vector = arr[:, np.newaxis]
print(f"Column vector shape: {col_vector.shape}")

# Multiple new axes
expanded = arr[np.newaxis, :, np.newaxis]
print(f"Expanded shape: {expanded.shape}")

# Equivalent using reshape
reshaped_row = arr.reshape(1, -1)
reshaped_col = arr.reshape(-1, 1)
print(f"\nReshape equivalent: {reshaped_row.shape}, {reshaped_col.shape}")
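
np.expand_dims is another way to write the same expansion, and the resulting row and column shapes are what make broadcasting work. The sketch below builds a small multiplication table from a single 1D array; the names row, col, and table are illustrative.

```python
import numpy as np

arr = np.arange(5)

# Same result as arr[np.newaxis, :] and arr[:, np.newaxis]
row = np.expand_dims(arr, axis=0)
col = np.expand_dims(arr, axis=1)
print(row.shape, col.shape)  # (1, 5) (5, 1)

# The expanded arrays are views, not copies
print(np.shares_memory(arr, col))  # True

# (5, 1) * (1, 5) broadcasts to (5, 5)
table = col * row
print(table.shape)  # (5, 5)
print(table[2, 3])  # 6
```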

Error Handling and Validation

Reshape fails when dimensions are incompatible. Always validate that the product of new dimensions equals the array size.

arr = np.arange(12)

# Valid reshape
try:
    reshaped = arr.reshape(3, 4)
    print(f"Success: {reshaped.shape}")
except ValueError as e:
    print(f"Error: {e}")

# Invalid reshape - wrong total elements
try:
    reshaped = arr.reshape(3, 5)
except ValueError as e:
    # Print the message NumPy actually raised
    print(f"Error: {e}")

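# Validation before reshaping
def safe_reshape(arr, new_shape):
    # Product of the explicitly given dimensions (1 if there are none)
    known = int(np.prod([d for d in new_shape if d != -1], dtype=np.int64))
    if -1 in new_shape:
        # One dimension is inferred, so the known ones must divide the size
        ok = known != 0 and arr.size % known == 0
    else:
        # No inferred dimension: the product must match the size exactly
        ok = known == arr.size
    if not ok:
        raise ValueError(f"Cannot reshape size {arr.size} to {new_shape}")
    return arr.reshape(new_shape)

result = safe_reshape(arr, (3, 4))
print(f"Safe reshape: {result.shape}")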

Practical Applications

Reshaping is fundamental for batch processing, image manipulation, and data transformation pipelines.

# Image batch processing (batch_size, height, width, channels)
images = np.random.rand(32, 28, 28, 3)
print(f"Image batch shape: {images.shape}")

# Flatten each image for ML model
flattened = images.reshape(32, -1)
print(f"Flattened for model: {flattened.shape}")

# Time series windowing
time_series = np.arange(100)
window_size = 10
stride = 5

# Create sliding windows
num_windows = (len(time_series) - window_size) // stride + 1
windows = np.array([time_series[i:i+window_size] 
                    for i in range(0, len(time_series)-window_size+1, stride)])
print(f"\nWindows shape: {windows.shape}")

# Matrix operations requiring specific shapes
matrix = np.arange(6).reshape(2, 3)
vector = np.arange(3)

# Reshape vector for broadcasting
result = matrix + vector.reshape(1, -1)
print(f"\nBroadcast result shape: {result.shape}")
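
For the windowing case, two alternatives avoid the Python-level loop. Non-overlapping windows are a plain reshape, and overlapping windows can be taken from sliding_window_view, assuming NumPy 1.20 or newer, where that function is available. This is a sketch of one approach, not the only way to window a series.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

time_series = np.arange(100)
window_size = 10
stride = 5

# Non-overlapping windows: a plain reshape, returned as a view
blocks = time_series.reshape(-1, window_size)
print(blocks.shape)  # (10, 10)

# Overlapping windows: build every window, then keep every stride-th one
windows = sliding_window_view(time_series, window_size)[::stride]
print(windows.shape)  # (19, 10)
print(windows[1, 0])  # 5
```

The array returned by sliding_window_view is a read-only view, so copy it before modifying the windows.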

Reshape operations form the backbone of array manipulation in NumPy. Master the distinction between views and copies, understand memory layout implications, and leverage automatic dimension calculation for robust, efficient code.
