NumPy - Create Diagonal Array (np.diag)
• `np.diag()` serves dual purposes: extracting diagonals from 2D arrays and constructing diagonal matrices from 1D arrays, making it essential for linear algebra operations
Key Insights
• np.diag() serves dual purposes: extracting diagonals from 2D arrays and constructing diagonal matrices from 1D arrays, making it essential for linear algebra operations
• The k parameter controls diagonal offset, enabling extraction and construction of super-diagonals (k>0) and sub-diagonals (k<0) beyond the main diagonal
• Understanding np.diag() behavior with different input dimensions prevents common bugs in matrix operations, particularly when working with eigenvalue decomposition and covariance matrices
Basic Diagonal Array Construction
The np.diag() function creates a 2D array with specified values along the diagonal and zeros elsewhere. Pass a 1D array to construct a square matrix:
import numpy as np
# Create diagonal matrix from 1D array
values = np.array([1, 2, 3, 4])
diagonal_matrix = np.diag(values)
print(diagonal_matrix)
Output:
[[1 0 0 0]
[0 2 0 0]
[0 0 3 0]
[0 0 0 4]]
This operation is fundamental for creating identity matrices, scaling transformations, and diagonal covariance matrices in machine learning applications:
# Create identity matrix
identity = np.diag(np.ones(5))
# Create scaling matrix
scale_factors = np.array([2.0, 3.0, 1.5])
scaling_matrix = np.diag(scale_factors)
# Create diagonal covariance matrix (assumes independence)
variances = np.array([0.5, 1.2, 0.8, 2.1])
covariance_matrix = np.diag(variances)
print(covariance_matrix)
Extracting Diagonals from Existing Arrays
When you pass a 2D array to np.diag(), it extracts the diagonal elements as a 1D array:
# Create a sample matrix
matrix = np.array([
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
])
# Extract main diagonal
main_diagonal = np.diag(matrix)
print(main_diagonal) # [10 50 90]
This extraction is crucial for operations like computing matrix traces, extracting eigenvalues from diagonal matrices, and analyzing correlation matrices:
# Compute trace (sum of diagonal elements)
trace = np.diag(matrix).sum()
print(f"Trace: {trace}") # 150
# Extract diagonal from correlation matrix
correlation_matrix = np.array([
[1.00, 0.85, 0.62],
[0.85, 1.00, 0.73],
[0.62, 0.73, 1.00]
])
diagonal = np.diag(correlation_matrix)
print(diagonal) # [1. 1. 1.]
Working with Offset Diagonals
The k parameter specifies which diagonal to extract or construct. Positive values indicate super-diagonals (above main), negative values indicate sub-diagonals (below main):
# Create matrix with offset diagonals
values = np.array([5, 6, 7])
# Main diagonal (k=0, default)
main = np.diag(values, k=0)
# First super-diagonal (k=1)
super_diag = np.diag(values, k=1)
# First sub-diagonal (k=-1)
sub_diag = np.diag(values, k=-1)
print("Super-diagonal (k=1):")
print(super_diag)
print("\nSub-diagonal (k=-1):")
print(sub_diag)
Output:
Super-diagonal (k=1):
[[0 5 0 0]
[0 0 6 0]
[0 0 0 7]
[0 0 0 0]]
Sub-diagonal (k=-1):
[[0 0 0 0]
[5 0 0 0]
[0 6 0 0]
[0 0 7 0]]
Extract offset diagonals from existing matrices:
matrix = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]
])
# Extract different diagonals
main = np.diag(matrix, k=0) # [1, 6, 11, 16]
super1 = np.diag(matrix, k=1) # [2, 7, 12]
super2 = np.diag(matrix, k=2) # [3, 8]
sub1 = np.diag(matrix, k=-1) # [5, 10, 15]
print(f"Main diagonal: {main}")
print(f"First super-diagonal: {super1}")
print(f"Second super-diagonal: {super2}")
print(f"First sub-diagonal: {sub1}")
Constructing Tridiagonal Matrices
Combining multiple np.diag() calls creates tridiagonal matrices, common in numerical methods for solving differential equations:
n = 5
# Create tridiagonal matrix
main_diag = np.diag(np.full(n, 2))
upper_diag = np.diag(np.full(n-1, -1), k=1)
lower_diag = np.diag(np.full(n-1, -1), k=-1)
tridiagonal = main_diag + upper_diag + lower_diag
print(tridiagonal)
Output:
[[ 2 -1 0 0 0]
[-1 2 -1 0 0]
[ 0 -1 2 -1 0]
[ 0 0 -1 2 -1]
[ 0 0 0 -1 2]]
This pattern appears in finite difference methods:
def create_finite_difference_matrix(n, dx):
"""Create second derivative finite difference matrix"""
main = np.diag(np.full(n, -2.0 / dx**2))
off_diag = np.diag(np.full(n-1, 1.0 / dx**2), k=1)
off_diag += np.diag(np.full(n-1, 1.0 / dx**2), k=-1)
return main + off_diag
# Create matrix for 10 grid points with spacing 0.1
A = create_finite_difference_matrix(10, 0.1)
print(A[:5, :5]) # Show first 5x5 block
Practical Applications
Diagonal matrix multiplication optimizes computational efficiency since most elements are zero:
# Inefficient: full matrix multiplication
D = np.diag([2, 3, 4])
A = np.random.rand(3, 3)
result_full = D @ A
# Efficient: element-wise multiplication using diagonal values
d_values = np.diag(D)
result_efficient = d_values[:, np.newaxis] * A
print(np.allclose(result_full, result_efficient)) # True
Creating block diagonal matrices for independent subsystems:
from scipy.linalg import block_diag
# Alternative using np.diag for simple cases
def create_block_diagonal(*blocks):
"""Create block diagonal from multiple arrays"""
# For demonstration with np.diag concepts
sizes = [len(b) for b in blocks]
total_size = sum(sizes)
result = np.zeros((total_size, total_size))
offset = 0
for block in blocks:
size = len(block)
result[offset:offset+size, offset:offset+size] = np.diag(block)
offset += size
return result
block1 = np.array([1, 2])
block2 = np.array([3, 4, 5])
block_matrix = create_block_diagonal(block1, block2)
print(block_matrix)
Eigenvalue decomposition verification:
# For diagonal matrices, diagonal elements are eigenvalues
D = np.diag([5, 3, 8, 1])
eigenvalues, eigenvectors = np.linalg.eig(D)
print("Diagonal elements:", np.diag(D))
print("Eigenvalues:", np.sort(eigenvalues))
print("Match:", np.allclose(np.sort(np.diag(D)), np.sort(eigenvalues)))
Performance Considerations
np.diag() creates new arrays rather than views, which matters for large matrices:
import time
# Large matrix diagonal extraction
large_matrix = np.random.rand(5000, 5000)
start = time.time()
diag_copy = np.diag(large_matrix)
time_diag = time.time() - start
# Alternative using diagonal view
start = time.time()
diag_view = np.diagonal(large_matrix)
time_diagonal = time.time() - start
print(f"np.diag time: {time_diag:.4f}s")
print(f"np.diagonal time: {time_diagonal:.4f}s")
For read-only operations, np.diagonal() provides a view without copying data. Use np.diag() when you need to construct new matrices or when the dual behavior (extract/construct) simplifies your code logic.
Understanding these nuances ensures efficient memory usage in production systems handling large-scale matrix operations.