NumPy - Delete Elements (np.delete)

The `np.delete()` function returns a copy of an array with the specified entries removed along a given axis.

Key Insights

  • np.delete() removes elements along a specified axis by index position, returning a new array without modifying the original
  • Delete operations on multi-dimensional arrays require explicit axis specification to control whether you’re removing rows, columns, or individual elements
  • For performance-critical code with repeated deletions, consider building a boolean mask and indexing once instead of making multiple np.delete() calls

Basic Syntax and Parameters

The np.delete() function removes specified entries from an array along a given axis. The function signature is:

numpy.delete(arr, obj, axis=None)
  • arr: Input array
  • obj: Index, slice, or array of indices indicating which elements to remove
  • axis: Axis along which to delete (None flattens the array first)

import numpy as np

# Simple 1D array deletion
arr = np.array([10, 20, 30, 40, 50])
result = np.delete(arr, 2)
print(result)  # [10 20 40 50]

# Original array unchanged
print(arr)  # [10 20 30 40 50]

# Delete multiple indices
result = np.delete(arr, [0, 2, 4])
print(result)  # [20 40]

# Delete using slice
result = np.delete(arr, slice(1, 4))
print(result)  # [10 50]
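
The obj argument follows ordinary indexing conventions, which the examples above only partly show. The following is a minimal sketch, reusing the same arr, of a negative index, a stepped slice, and a check that the result is a copy. The variable names last_removed and every_other are illustrative only.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Negative indices count from the end, as in ordinary indexing
last_removed = np.delete(arr, -1)
print(last_removed)  # [10 20 30 40]

# A slice with a step removes every second element (indices 0, 2, 4)
every_other = np.delete(arr, slice(None, None, 2))
print(every_other)  # [20 40]

# The result is a copy: it does not share memory with the input
print(np.shares_memory(arr, last_removed))  # False
```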

Deleting from Multi-Dimensional Arrays

When working with 2D or higher-dimensional arrays, the axis parameter determines which dimension to operate on.

# 2D array operations
matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

# Delete row (axis=0)
result = np.delete(matrix, 1, axis=0)
print(result)
# [[ 1  2  3  4]
#  [ 9 10 11 12]]

# Delete column (axis=1)
result = np.delete(matrix, [1, 3], axis=1)
print(result)
# [[ 1  3]
#  [ 5  7]
#  [ 9 11]]

# Delete without axis (flattens first)
result = np.delete(matrix, [0, 5, 11])
print(result)  # [ 2  3  4  5  7  8  9 10 11]
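
One way to see what the axis argument does is to reproduce a column deletion with a boolean "keep" mask. This is only a sketch for comparison, not a description of how np.delete() is implemented; keep_cols, via_mask, and via_delete are names chosen for this example.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

# Mark every column as kept, then switch off the columns to drop
keep_cols = np.ones(matrix.shape[1], dtype=bool)
keep_cols[[1, 3]] = False

# Indexing along axis 1 with the mask keeps columns 0 and 2
via_mask = matrix[:, keep_cols]
via_delete = np.delete(matrix, [1, 3], axis=1)

print(np.array_equal(via_mask, via_delete))  # True
print(via_delete.shape)  # (3, 2)
```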

Advanced Index Selection

You can combine boolean conditions with np.where() to identify indices for deletion, enabling conditional removal of elements.

# Delete based on condition
data = np.array([15, 23, 8, 42, 16, 4, 35])

# Remove elements greater than 20
indices_to_delete = np.where(data > 20)[0]
result = np.delete(data, indices_to_delete)
print(result)  # [15  8 16  4]

# Delete even-indexed positions
even_indices = np.arange(0, len(data), 2)
result = np.delete(data, even_indices)
print(result)  # [23 42  4]

# Complex condition: remove values between 10 and 25
mask = (data >= 10) & (data <= 25)
indices = np.where(mask)[0]
result = np.delete(data, indices)
print(result)  # [ 8 42  4 35]
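
When the indices come from a condition, the np.where() step can often be skipped. The sketch below compares three spellings of the last example: np.flatnonzero() as a shorthand for np.where(mask)[0] on 1D input, plain inverted-mask indexing, and passing the boolean mask straight to np.delete(). The last form assumes NumPy 1.19 or later, where a boolean obj is treated as a mask of elements to remove.

```python
import numpy as np

data = np.array([15, 23, 8, 42, 16, 4, 35])
mask = (data >= 10) & (data <= 25)

# Index-based deletion; flatnonzero returns the positions where mask is True
via_indices = np.delete(data, np.flatnonzero(mask))

# Inverted mask: keep everything the condition does not select
via_invert = data[~mask]

# Assumes NumPy >= 1.19: a boolean obj marks the elements to remove
via_bool_obj = np.delete(data, mask)

print(via_invert)  # [ 8 42  4 35]
print(np.array_equal(via_indices, via_invert))  # True
print(np.array_equal(via_bool_obj, via_invert))  # True
```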

Working with 3D Arrays

Higher-dimensional arrays follow the same axis logic, but require careful attention to which dimension you’re modifying.

# 3D array: (depth, rows, columns)
cube = np.arange(24).reshape(2, 3, 4)
print("Original shape:", cube.shape)  # (2, 3, 4)

# Delete along depth (axis=0)
result = np.delete(cube, 0, axis=0)
print("After axis=0 delete:", result.shape)  # (1, 3, 4)

# Delete along rows (axis=1)
result = np.delete(cube, [0, 2], axis=1)
print("After axis=1 delete:", result.shape)  # (2, 1, 4)

# Delete along columns (axis=2)
result = np.delete(cube, slice(1, 3), axis=2)
print("After axis=2 delete:", result.shape)  # (2, 3, 2)
print(result)
# [[[ 0  3]
#   [ 4  7]
#   [ 8 11]]
#  [[12 15]
#   [16 19]
#   [20 23]]]
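
A quick way to check which dimension a deletion touches is to compare shapes before and after. In this sketch (variable names are illustrative), removing one index along an axis shrinks only that axis by one, and a negative axis counts from the last dimension.

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)

shapes = []
for axis in range(cube.ndim):
    # Remove index 0 along the current axis
    result = np.delete(cube, 0, axis=axis)
    shapes.append(result.shape)
    print(f"axis={axis}: {cube.shape} -> {result.shape}")

# axis=-1 refers to the last dimension (columns here)
last_axis = np.delete(cube, 0, axis=-1)
print(last_axis.shape)  # (2, 3, 3)
```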

Performance Considerations

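np.delete() always allocates and fills a new array, so calling it repeatedly (for example, inside a loop) copies the data every time. When the elements to drop are defined by a condition, a single boolean-indexing step is often faster, although it also returns a copy.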

import time

# Setup large array
large_array = np.random.randint(0, 100, size=1000000)

# Method 1: np.delete() (the timing includes the np.where() index lookup)
start = time.perf_counter()
indices_to_remove = np.where(large_array > 50)[0]
result1 = np.delete(large_array, indices_to_remove)
time1 = time.perf_counter() - start

# Method 2: Boolean masking
start = time.perf_counter()
result2 = large_array[large_array <= 50]
time2 = time.perf_counter() - start

# Both methods must keep exactly the same elements
assert np.array_equal(result1, result2)

print(f"np.delete() time: {time1:.4f}s")
print(f"Boolean mask time: {time2:.4f}s")
print(f"Speedup: {time1/time2:.2f}x")
# Boolean masking is often faster here, but the ratio varies with
# array size, dtype, and NumPy version, so measure on your own data
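
A single wall-clock measurement can be noisy. As a sketch of a somewhat more robust comparison, the standard-library timeit module can repeat each approach and report the best run. The helper names with_delete and with_mask are made up for this example, and no particular speedup is claimed; the numbers depend on the machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
large_array = rng.integers(0, 100, size=1_000_000)

def with_delete():
    # Look up the indices to drop, then delete them in one call
    return np.delete(large_array, np.where(large_array > 50)[0])

def with_mask():
    # Keep the complement of the condition in one indexing step
    return large_array[large_array <= 50]

# Sanity check: both approaches keep the same elements
same_result = np.array_equal(with_delete(), with_mask())

# Best of 3 repeats, 5 calls each, reported per call
t_delete = min(timeit.repeat(with_delete, number=5, repeat=3)) / 5
t_mask = min(timeit.repeat(with_mask, number=5, repeat=3)) / 5

print(f"Results match: {same_result}")
print(f"np.delete():  {t_delete * 1e3:.2f} ms per call")
print(f"Boolean mask: {t_mask * 1e3:.2f} ms per call")
```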

Practical Use Cases

Removing Outliers from Dataset

# Remove statistical outliers using IQR method
data = np.array([12, 15, 14, 13, 100, 16, 15, 14, 2, 13])

Q1 = np.percentile(data, 25)
Q3 = np.percentile(data, 75)
IQR = Q3 - Q1

lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

outlier_indices = np.where((data < lower_bound) | (data > upper_bound))[0]
cleaned_data = np.delete(data, outlier_indices)

print(f"Original: {data}")
print(f"Cleaned: {cleaned_data}")
print(f"Removed {len(outlier_indices)} outliers")

Removing Empty or Invalid Rows from Matrix

# Dataset with missing values (represented as -999)
dataset = np.array([[1.2, 3.4, 5.6],
                    [-999, 2.3, 4.5],
                    [2.1, 3.2, 4.3],
                    [1.5, -999, 3.7],
                    [3.3, 4.4, 5.5]])

# Find rows containing invalid values
invalid_rows = np.where(np.any(dataset == -999, axis=1))[0]
clean_dataset = np.delete(dataset, invalid_rows, axis=0)

print("Clean dataset shape:", clean_dataset.shape)
print(clean_dataset)
# [[1.2 3.4 5.6]
#  [2.1 3.2 4.3]
#  [3.3 4.4 5.5]]
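
The example above uses -999 as its sentinel. If the missing values were stored as NaN instead (an assumption, not something the dataset above does), an equality test would not find them, because NaN never compares equal to anything. A sketch of the same row removal using np.isnan():

```python
import numpy as np

# Hypothetical variant of the dataset with NaN marking missing values
dataset = np.array([[1.2, 3.4, 5.6],
                    [np.nan, 2.3, 4.5],
                    [2.1, 3.2, 4.3]])

# Equality against NaN is always False, so this finds nothing
found_by_equality = bool(np.any(dataset == np.nan))
print(found_by_equality)  # False

# np.isnan() flags the missing entries; any(axis=1) collapses to rows
invalid_rows = np.where(np.isnan(dataset).any(axis=1))[0]
clean_dataset = np.delete(dataset, invalid_rows, axis=0)

print(invalid_rows)  # [1]
print(clean_dataset.shape)  # (2, 3)
```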

Time Series Data Filtering

# Remove weekends from time series data
dates = np.arange('2024-01-01', '2024-01-15', dtype='datetime64[D]')
values = np.random.randn(14)

# Get day of week (0=Monday, 6=Sunday)
# Day 0 of the epoch (1970-01-01) was a Thursday, so subtracting 4
# modulo 7 maps it to 3 and every Monday to 0
weekdays = (dates.astype('int64') - 4) % 7

# Remove Saturday (5) and Sunday (6)
weekend_indices = np.where(weekdays >= 5)[0]
weekday_dates = np.delete(dates, weekend_indices)
weekday_values = np.delete(values, weekend_indices)

print(f"Original: {len(dates)} days")
print(f"Weekdays only: {len(weekday_dates)} days")
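
The weekday arithmetic can be cross-checked against NumPy's own business-day helper. This sketch assumes the default np.is_busday() week mask (Monday to Friday, no holidays) and compares it with the np.delete() result from above; via_delete and via_busday are illustrative names.

```python
import numpy as np

dates = np.arange('2024-01-01', '2024-01-15', dtype='datetime64[D]')

# Same arithmetic as above: 0=Monday ... 6=Sunday
weekdays = (dates.astype('int64') - 4) % 7
via_delete = np.delete(dates, np.where(weekdays >= 5)[0])

# np.is_busday() is True for Monday through Friday by default
via_busday = dates[np.is_busday(dates)]

print(np.array_equal(via_delete, via_busday))  # True
print(len(via_busday))  # 10
```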

Common Pitfalls

# Pitfall 1: Deleting in a loop shifts the remaining indices
arr = np.array([1, 2, 3, 4, 5])
indices_to_remove = [1, 3]

# WRONG: Indices shift after first deletion
# for idx in indices_to_remove:
#     arr = np.delete(arr, idx)

# CORRECT: Delete all at once
arr = np.delete(arr, indices_to_remove)

# Pitfall 2: Forgetting axis parameter
matrix = np.array([[1, 2], [3, 4], [5, 6]])
# This flattens then deletes
result = np.delete(matrix, 1)  # Returns [1 3 4 5 6]
# Specify axis to delete row
result = np.delete(matrix, 1, axis=0)  # Returns [[1 2], [5 6]]

# Pitfall 3: Assuming in-place modification
original = np.array([1, 2, 3])
np.delete(original, 1)  # Returns new array
print(original)  # Still [1 2 3], unchanged
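
To make the first pitfall concrete, the sketch below runs the commented-out loop on a copy and shows which elements it actually removes. It also shows that, if a loop cannot be avoided, iterating from the highest index down sidesteps the shift. The names shifted and descending are chosen for this example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
indices_to_remove = [1, 3]

# Ascending loop: after index 1 is removed, old index 4 becomes index 3
shifted = arr.copy()
for idx in indices_to_remove:
    shifted = np.delete(shifted, idx)
print(shifted)  # [1 3 4]  (removed 2 and 5, not 2 and 4)

# Descending loop: earlier positions are unaffected by later deletions
descending = arr.copy()
for idx in sorted(indices_to_remove, reverse=True):
    descending = np.delete(descending, idx)
print(descending)  # [1 3 5]

# Single call: all indices refer to the original array
at_once = np.delete(arr, indices_to_remove)
print(at_once)  # [1 3 5]
```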

The np.delete() function provides a clean interface for removing array elements, but understanding its behavior with different dimensions and performance characteristics ensures you choose the right tool for each situation.
