NumPy - Create Diagonal Array (np.diag)

• `np.diag()` serves dual purposes: extracting diagonals from 2D arrays and constructing diagonal matrices from 1D arrays, making it essential for linear algebra operations

Key Insights

np.diag() serves dual purposes: extracting diagonals from 2D arrays and constructing diagonal matrices from 1D arrays, making it essential for linear algebra operations • The k parameter controls diagonal offset, enabling extraction and construction of super-diagonals (k>0) and sub-diagonals (k<0) beyond the main diagonal • Understanding np.diag() behavior with different input dimensions prevents common bugs in matrix operations, particularly when working with eigenvalue decomposition and covariance matrices

Basic Diagonal Array Construction

The np.diag() function creates a 2D array with specified values along the diagonal and zeros elsewhere. Pass a 1D array to construct a square matrix:

import numpy as np

# Create diagonal matrix from 1D array
values = np.array([1, 2, 3, 4])
diagonal_matrix = np.diag(values)
print(diagonal_matrix)

Output:

[[1 0 0 0]
 [0 2 0 0]
 [0 0 3 0]
 [0 0 0 4]]

This operation is fundamental for creating identity matrices, scaling transformations, and diagonal covariance matrices in machine learning applications:

# Create identity matrix
identity = np.diag(np.ones(5))

# Create scaling matrix
scale_factors = np.array([2.0, 3.0, 1.5])
scaling_matrix = np.diag(scale_factors)

# Create diagonal covariance matrix (assumes independence)
variances = np.array([0.5, 1.2, 0.8, 2.1])
covariance_matrix = np.diag(variances)
print(covariance_matrix)

Extracting Diagonals from Existing Arrays

When you pass a 2D array to np.diag(), it extracts the diagonal elements as a 1D array:

# Create a sample matrix
matrix = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

# Extract main diagonal
main_diagonal = np.diag(matrix)
print(main_diagonal)  # [10 50 90]

This extraction is crucial for operations like computing matrix traces, extracting eigenvalues from diagonal matrices, and analyzing correlation matrices:

# Compute trace (sum of diagonal elements)
trace = np.diag(matrix).sum()
print(f"Trace: {trace}")  # 150

# Extract diagonal from correlation matrix
correlation_matrix = np.array([
    [1.00, 0.85, 0.62],
    [0.85, 1.00, 0.73],
    [0.62, 0.73, 1.00]
])
diagonal = np.diag(correlation_matrix)
print(diagonal)  # [1. 1. 1.]

Working with Offset Diagonals

The k parameter specifies which diagonal to extract or construct. Positive values indicate super-diagonals (above main), negative values indicate sub-diagonals (below main):

# Create matrix with offset diagonals
values = np.array([5, 6, 7])

# Main diagonal (k=0, default)
main = np.diag(values, k=0)

# First super-diagonal (k=1)
super_diag = np.diag(values, k=1)

# First sub-diagonal (k=-1)
sub_diag = np.diag(values, k=-1)

print("Super-diagonal (k=1):")
print(super_diag)
print("\nSub-diagonal (k=-1):")
print(sub_diag)

Output:

Super-diagonal (k=1):
[[0 5 0 0]
 [0 0 6 0]
 [0 0 0 7]
 [0 0 0 0]]

Sub-diagonal (k=-1):
[[0 0 0 0]
 [5 0 0 0]
 [0 6 0 0]
 [0 0 7 0]]

Extract offset diagonals from existing matrices:

matrix = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16]
])

# Extract different diagonals
main = np.diag(matrix, k=0)      # [1, 6, 11, 16]
super1 = np.diag(matrix, k=1)    # [2, 7, 12]
super2 = np.diag(matrix, k=2)    # [3, 8]
sub1 = np.diag(matrix, k=-1)     # [5, 10, 15]

print(f"Main diagonal: {main}")
print(f"First super-diagonal: {super1}")
print(f"Second super-diagonal: {super2}")
print(f"First sub-diagonal: {sub1}")

Constructing Tridiagonal Matrices

Combining multiple np.diag() calls creates tridiagonal matrices, common in numerical methods for solving differential equations:

n = 5

# Create tridiagonal matrix
main_diag = np.diag(np.full(n, 2))
upper_diag = np.diag(np.full(n-1, -1), k=1)
lower_diag = np.diag(np.full(n-1, -1), k=-1)

tridiagonal = main_diag + upper_diag + lower_diag
print(tridiagonal)

Output:

[[ 2 -1  0  0  0]
 [-1  2 -1  0  0]
 [ 0 -1  2 -1  0]
 [ 0  0 -1  2 -1]
 [ 0  0  0 -1  2]]

This pattern appears in finite difference methods:

def create_finite_difference_matrix(n, dx):
    """Create second derivative finite difference matrix"""
    main = np.diag(np.full(n, -2.0 / dx**2))
    off_diag = np.diag(np.full(n-1, 1.0 / dx**2), k=1)
    off_diag += np.diag(np.full(n-1, 1.0 / dx**2), k=-1)
    return main + off_diag

# Create matrix for 10 grid points with spacing 0.1
A = create_finite_difference_matrix(10, 0.1)
print(A[:5, :5])  # Show first 5x5 block

Practical Applications

Diagonal matrix multiplication optimizes computational efficiency since most elements are zero:

# Inefficient: full matrix multiplication
D = np.diag([2, 3, 4])
A = np.random.rand(3, 3)
result_full = D @ A

# Efficient: element-wise multiplication using diagonal values
d_values = np.diag(D)
result_efficient = d_values[:, np.newaxis] * A

print(np.allclose(result_full, result_efficient))  # True

Creating block diagonal matrices for independent subsystems:

from scipy.linalg import block_diag

# Alternative using np.diag for simple cases
def create_block_diagonal(*blocks):
    """Create block diagonal from multiple arrays"""
    # For demonstration with np.diag concepts
    sizes = [len(b) for b in blocks]
    total_size = sum(sizes)
    result = np.zeros((total_size, total_size))
    
    offset = 0
    for block in blocks:
        size = len(block)
        result[offset:offset+size, offset:offset+size] = np.diag(block)
        offset += size
    
    return result

block1 = np.array([1, 2])
block2 = np.array([3, 4, 5])
block_matrix = create_block_diagonal(block1, block2)
print(block_matrix)

Eigenvalue decomposition verification:

# For diagonal matrices, diagonal elements are eigenvalues
D = np.diag([5, 3, 8, 1])
eigenvalues, eigenvectors = np.linalg.eig(D)

print("Diagonal elements:", np.diag(D))
print("Eigenvalues:", np.sort(eigenvalues))
print("Match:", np.allclose(np.sort(np.diag(D)), np.sort(eigenvalues)))

Performance Considerations

np.diag() creates new arrays rather than views, which matters for large matrices:

import time

# Large matrix diagonal extraction
large_matrix = np.random.rand(5000, 5000)

start = time.time()
diag_copy = np.diag(large_matrix)
time_diag = time.time() - start

# Alternative using diagonal view
start = time.time()
diag_view = np.diagonal(large_matrix)
time_diagonal = time.time() - start

print(f"np.diag time: {time_diag:.4f}s")
print(f"np.diagonal time: {time_diagonal:.4f}s")

For read-only operations, np.diagonal() provides a view without copying data. Use np.diag() when you need to construct new matrices or when the dual behavior (extract/construct) simplifies your code logic.

Understanding these nuances ensures efficient memory usage in production systems handling large-scale matrix operations.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.