How to Diagonalize a Matrix in Python
Key Insights
- Matrix diagonalization transforms a matrix into its simplest form A = PDP⁻¹, where D contains eigenvalues and P contains eigenvectors as columns—this dramatically simplifies operations like computing matrix powers or solving differential equations.
- NumPy's linalg.eig() returns eigenvalues and eigenvectors directly, but you must verify diagonalizability by checking that the eigenvectors are linearly independent; not all matrices can be diagonalized.
- For practical applications, diagonalization replaces repeated O(n³) matrix multiplications with a one-time O(n³) decomposition plus elementwise exponentiation of the n diagonal entries, making large powers orders of magnitude cheaper.
Introduction to Matrix Diagonalization
Matrix diagonalization is the process of converting a square matrix into a diagonal matrix through a similarity transformation. Mathematically, a matrix A is diagonalizable if there exists an invertible matrix P and a diagonal matrix D such that A = PDP⁻¹.
The diagonal matrix D contains the eigenvalues of A along its diagonal, while the columns of P are the corresponding eigenvectors. This transformation is fundamental because diagonal matrices are trivial to work with—raising them to powers, computing exponentials, or solving systems becomes straightforward.
Diagonalization appears throughout applied mathematics and data science. In principal component analysis (PCA), we diagonalize covariance matrices to find principal components. In solving systems of differential equations, diagonalization decouples the equations. For Markov chains, it helps compute long-term state probabilities efficiently.
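As a concrete illustration of the Markov-chain use case, here is a small sketch (the two-state transition matrix is made up for illustration) that uses diagonalization to compute long-run state probabilities:

```python
import numpy as np

# Hypothetical two-state transition matrix (columns sum to 1):
# state probabilities evolve as x_{k+1} = T @ x_k
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])

vals, P = np.linalg.eig(T)
# T^50 = P D^50 P^{-1}; only the diagonal entries need exponentiating
T50 = P @ np.diag(vals**50) @ np.linalg.inv(P)

x0 = np.array([1.0, 0.0])  # start in state 0
print(T50 @ x0)            # ≈ [0.667, 0.333], the stationary distribution
```

Because one eigenvalue is 1 and the other has magnitude below 1, high powers of T converge to a projection onto the stationary distribution.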
Here’s what diagonalization looks like for a simple 2×2 matrix:
import numpy as np

# Original matrix
A = np.array([[4, 1],
              [2, 3]])

# After diagonalization: A = PDP^(-1)
# D will be: [[5, 0],
#             [0, 2]]
# (eigenvalues on diagonal)
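You can confirm those eigenvalues numerically with a quick check (NumPy only):

```python
import numpy as np

A = np.array([[4, 1],
              [2, 3]])

# eigvals returns eigenvalues in no guaranteed order, so sort first
vals = np.sort(np.linalg.eigvals(A))
print(vals)  # eigenvalues 2 and 5
```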
Prerequisites and Setup
To diagonalize matrices in Python, you need NumPy for basic linear algebra operations and optionally SciPy for more specialized functions.
import numpy as np
from numpy.linalg import eig, inv
import scipy.linalg as la
Mathematically, diagonalization requires:
- The matrix must be square (n×n)
- The matrix must have n linearly independent eigenvectors
- The eigendecomposition A = PDP⁻¹ must exist
The formula breaks down as:
- A: Original n×n matrix
- P: Matrix whose columns are eigenvectors of A
- D: Diagonal matrix with eigenvalues λ₁, λ₂, …, λₙ
- P⁻¹: Inverse of the eigenvector matrix
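These pieces fit together via the equivalent defining relation AP = PD, i.e., each column vᵢ of P satisfies Avᵢ = λᵢvᵢ. A minimal check (matrix values chosen only for illustration):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, P = np.linalg.eig(A)
D = np.diag(vals)

# A @ P stacks A v_i as columns; P @ D scales each column v_i by lambda_i
print(np.allclose(A @ P, P @ D))  # True
```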
Not every matrix is diagonalizable. If a matrix doesn’t have enough linearly independent eigenvectors, you’ll need the Jordan normal form instead.
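If you do need the Jordan normal form in practice, SymPy (not used elsewhere in this article) can compute it symbolically; a sketch on a defective matrix:

```python
import sympy as sp

# Defective: eigenvalue 1 has algebraic multiplicity 2
# but only one linearly independent eigenvector
A = sp.Matrix([[1, 1],
               [0, 1]])

P, J = A.jordan_form()  # A = P J P^{-1}, with J in Jordan form
print(J)                # Matrix([[1, 1], [0, 1]]) -- a single 2x2 Jordan block
```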
# Create a sample matrix
A = np.array([[6, -1, 0],
              [-1, 5, -1],
              [0, -1, 4]], dtype=float)
Diagonalization Using NumPy
NumPy’s linalg.eig() function computes eigenvalues and eigenvectors in one call. The function returns eigenvalues as a 1D array and eigenvectors as columns of a 2D array.
# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = eig(A)
print("Eigenvalues:")
print(eigenvalues)
print("\nEigenvectors (as columns):")
print(eigenvectors)
# Construct diagonal matrix D
D = np.diag(eigenvalues)
print("\nDiagonal matrix D:")
print(D)
# The eigenvector matrix is P
P = eigenvectors
print("\nTransformation matrix P:")
print(P)
# Verify: A = PDP^(-1)
P_inv = inv(P)
A_reconstructed = P @ D @ P_inv
print("\nOriginal A:")
print(A)
print("\nReconstructed A (should match):")
print(A_reconstructed)
# Check if they're equal (within numerical precision)
print("\nAre they equal?", np.allclose(A, A_reconstructed))
The verification step is crucial. Due to floating-point arithmetic, you should use np.allclose() rather than exact equality checks. This function returns True if arrays are element-wise equal within a tolerance.
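For instance, two arrays that differ only at the level of floating-point round-off fail an exact comparison but still compare equal under np.allclose():

```python
import numpy as np

a = np.array([1.0, 2.0])
b = a + 1e-12             # round-off-sized perturbation

print(a == b)             # [False False] -- exact comparison fails
print(np.allclose(a, b))  # True -- within default tolerances
```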
Handling Special Cases
Not all matrices are diagonalizable. A matrix is defective if it doesn’t have a complete set of linearly independent eigenvectors.
# Diagonalizable matrix
A_diag = np.array([[1, 2],
                   [2, 1]], dtype=float)

# Non-diagonalizable (defective) matrix
A_defective = np.array([[1, 1],
                        [0, 1]], dtype=float)
def check_diagonalizable(A):
    """Check if a matrix is diagonalizable (numerical heuristic)."""
    eigenvalues, eigenvectors = eig(A)
    # Eigenvectors are linearly independent exactly when the eigenvector
    # matrix has full rank (judged to floating-point tolerance)
    rank = np.linalg.matrix_rank(eigenvectors)
    n = A.shape[0]
    return rank == n
print("A_diag diagonalizable?", check_diagonalizable(A_diag))
print("A_defective diagonalizable?", check_diagonalizable(A_defective))
Symmetric matrices have a special property: they're always diagonalizable, with orthogonal eigenvectors. For symmetric matrices, P becomes an orthogonal matrix (P⁻¹ = Pᵀ), which is numerically more stable. In code, prefer np.linalg.eigh() over eig() for symmetric or Hermitian input: it exploits the symmetry and guarantees orthonormal eigenvectors.
# Symmetric matrix
A_sym = np.array([[4, 1, 2],
                  [1, 5, 3],
                  [2, 3, 6]], dtype=float)

# eigh is specialized for symmetric/Hermitian matrices
eigenvalues, eigenvectors = np.linalg.eigh(A_sym)

# For symmetric matrices, verify orthogonality
print("P^T @ P (should be identity):")
print(eigenvectors.T @ eigenvectors)
Practical Application: Power Computation
One of the most compelling uses of diagonalization is computing matrix powers efficiently. If A = PDP⁻¹, then A^n = PD^nP⁻¹. Since D is diagonal, D^n is trivial—just raise each diagonal element to the nth power.
import time
# Create a test matrix
A = np.array([[3, 1, 0],
              [1, 2, 1],
              [0, 1, 3]], dtype=float)
n = 100 # Compute A^100
# Method 1: np.linalg.matrix_power (repeated squaring)
start = time.time()
A_power_direct = np.linalg.matrix_power(A, n)
time_direct = time.time() - start
# Method 2: Using diagonalization (decompose once, then cheap powers)
start = time.time()
eigenvalues, P = eig(A)
D = np.diag(eigenvalues)
P_inv = inv(P)
# Raise diagonal matrix to power n
D_power = np.diag(eigenvalues ** n)
# Reconstruct A^n
A_power_diag = P @ D_power @ P_inv
time_diag = time.time() - start
print(f"Direct method: {time_direct:.6f} seconds")
print(f"Diagonalization method: {time_diag:.6f} seconds")
print(f"Speedup: {time_direct/time_diag:.2f}x")
print(f"\nResults match: {np.allclose(A_power_direct, A_power_diag)}")
Note that np.linalg.matrix_power is itself efficient for a single power (it uses repeated squaring, so only O(log n) multiplications), and for small matrices the timings above may be close. Diagonalization wins decisively when the exponent is very large or when you need many different powers of the same matrix, since the decomposition cost is paid only once.
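When many powers of the same matrix are needed, the decomposition can be done once and reused; a sketch (helper name is illustrative):

```python
import numpy as np

def make_power_fn(A):
    """Factor A once; return a function that computes A**n cheaply."""
    vals, P = np.linalg.eig(A)
    P_inv = np.linalg.inv(P)
    def power(n):
        # Only the eigenvalues are exponentiated on each call
        return (P @ np.diag(vals**n) @ P_inv).real
    return power

A = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 3.0]])
A_pow = make_power_fn(A)
print(np.allclose(A_pow(5), np.linalg.matrix_power(A, 5)))  # True
```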
Common Pitfalls and Best Practices
Numerical Precision: Floating-point arithmetic introduces small errors. Always use np.allclose() for comparisons rather than exact equality.
# Set custom tolerance for stricter checking
A_reconstructed = P @ D @ P_inv
is_equal = np.allclose(A, A_reconstructed, rtol=1e-10, atol=1e-12)
# Or check the maximum absolute difference
max_error = np.max(np.abs(A - A_reconstructed))
print(f"Maximum reconstruction error: {max_error}")
Complex Eigenvalues: Real matrices can have complex eigenvalues. Handle them appropriately:
# Matrix with complex eigenvalues
A_complex = np.array([[0, -1],
                      [1, 0]], dtype=float)
eigenvalues, eigenvectors = eig(A_complex)
print("Eigenvalues:", eigenvalues)
print("Are eigenvalues complex?", np.iscomplexobj(eigenvalues))
# If you need real results, check beforehand
if np.iscomplexobj(eigenvalues):
    print("Warning: Matrix has complex eigenvalues")
    # Consider using Schur decomposition instead
    T, Z = la.schur(A_complex)
Numerical Stability: For ill-conditioned matrices, consider scipy.linalg.eig(), which exposes additional options (such as left eigenvectors and generalized eigenproblems) and can be preferable on some inputs:
from scipy.linalg import eig as scipy_eig
# SciPy version with additional options
eigenvalues, eigenvectors = scipy_eig(A, left=False, right=True)
Sparse Matrices: For large sparse matrices, use specialized routines from scipy.sparse.linalg:
from scipy.sparse.linalg import eigs
from scipy.sparse import csr_matrix
# For large sparse matrices, compute only k eigenvalues
# A_sparse = csr_matrix(large_sparse_array)
# eigenvalues, eigenvectors = eigs(A_sparse, k=10)
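A runnable sketch on a small sparse matrix (the size, matrix, and k here are chosen arbitrarily for demonstration):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigs

# 100x100 sparse tridiagonal matrix (a common test case)
n = 100
A_sparse = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n))

# Compute only the 5 largest-magnitude eigenvalues
vals, vecs = eigs(A_sparse, k=5, which='LM')
print(np.sort(vals.real))
```

Unlike eig(), eigs() never forms a dense matrix, so it scales to problems where the full eigendecomposition would not fit in memory.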
Conclusion
Matrix diagonalization in Python is straightforward with NumPy: compute eigenvalues and eigenvectors with linalg.eig(), construct the diagonal matrix D and transformation matrix P, and verify with A = PDP⁻¹.
The workflow is:
- Check that the matrix is square and likely diagonalizable
- Compute the eigendecomposition with eig()
- Construct D from the eigenvalues and use the eigenvectors as P
- Verify the reconstruction within numerical tolerance
Diagonalization shines when you need to compute matrix powers, solve differential equations, or perform PCA. It’s most useful when you’ll perform multiple operations on the same matrix or need to raise it to large powers.
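For the differential-equation case, the linear system x' = Ax has solution x(t) = P e^{Dt} P⁻¹ x(0); a sketch (the system matrix is an arbitrary example), checked against scipy.linalg.expm:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])  # example system x' = A x (eigenvalues -1, -2)
x0 = np.array([1.0, 0.0])
t = 0.5

vals, P = np.linalg.eig(A)
# e^{At} = P e^{Dt} P^{-1}; exponentiating the diagonal is elementwise
x_t = (P @ np.diag(np.exp(vals * t)) @ np.linalg.inv(P) @ x0).real

print(np.allclose(x_t, expm(A * t) @ x0))  # True
```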
For matrices that aren’t diagonalizable, explore the Jordan normal form or Schur decomposition. For large-scale problems, investigate singular value decomposition (SVD) which works for any matrix, not just square ones. Understanding diagonalization provides the foundation for these more advanced techniques.