NumPy - Cholesky Decomposition

Key Insights

  • Cholesky decomposition factors a positive definite matrix into L·L^T, where L is lower triangular, providing a computationally efficient alternative to LU decomposition with half the operations
  • NumPy’s numpy.linalg.cholesky() computes the factor in roughly n³/3 floating-point operations, making it ideal for solving linear systems, generating correlated random variables, and optimizing matrix operations in machine learning
  • The decomposition requires strict positive definiteness—matrices must be symmetric with positive eigenvalues—and fails on ill-conditioned or near-singular matrices without proper numerical handling

Understanding Cholesky Decomposition

Cholesky decomposition transforms a symmetric positive definite matrix A into the product of a lower triangular matrix L and its transpose: A = L·L^T. This factorization is unique when A is positive definite and provides significant computational advantages over general matrix factorizations.

import numpy as np

# Create a positive definite matrix
A = np.array([[4, 2, 1],
              [2, 5, 3],
              [1, 3, 6]], dtype=float)

# Perform Cholesky decomposition
L = np.linalg.cholesky(A)

print("Original matrix A:")
print(A)
print("\nLower triangular matrix L:")
print(L)
print("\nVerification L @ L.T:")
print(L @ L.T)

The decomposition requires exactly half the operations of LU decomposition because it exploits the symmetry of the input matrix. For an n×n matrix, Cholesky needs approximately n³/3 floating-point operations compared to 2n³/3 for LU.
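The factorization itself follows a short recurrence. A didactic NumPy sketch makes that structure explicit (np.linalg.cholesky itself calls optimized LAPACK code, so this version is for understanding, not production):

```python
import numpy as np

def cholesky_banachiewicz(A):
    """Row-by-row Cholesky: L[i, j] = (A[i, j] - sum_{k<j} L[i, k] * L[j, k]) / L[j, j],
    with the diagonal entry taking a square root instead of the division."""
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(i + 1):
            # Dot product over the already-computed part of rows i and j
            s = A[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = np.sqrt(s) if i == j else s / L[j, j]
    return L

A = np.array([[4, 2, 1],
              [2, 5, 3],
              [1, 3, 6]], dtype=float)
print(np.allclose(cholesky_banachiewicz(A), np.linalg.cholesky(A)))  # True
```

The inner loop only runs for j ≤ i, which is exactly where the factor-of-two saving over LU comes from: only the lower triangle is ever computed.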

Solving Linear Systems

Cholesky decomposition excels at solving systems of linear equations Ax = b when A is positive definite. The process reduces to two triangular solves, avoiding an explicit matrix inversion.

# Solve Ax = b using Cholesky decomposition
A = np.array([[4, 2, 1],
              [2, 5, 3],
              [1, 3, 6]], dtype=float)
b = np.array([7, 13, 10], dtype=float)

# Decompose A = L @ L.T
L = np.linalg.cholesky(A)

# Solve L @ y = b (forward substitution step; note np.linalg.solve treats
# its input as a general matrix -- scipy.linalg.solve_triangular would
# exploit the triangular structure)
y = np.linalg.solve(L, b)

# Solve L.T @ x = y (backward substitution step)
x = np.linalg.solve(L.T, y)

print("Solution x:", x)
print("Verification A @ x:", A @ x)
print("Expected b:", b)
print("Error:", np.linalg.norm(A @ x - b))

This approach is numerically stable and faster than direct inversion. For repeated solves with the same matrix but different right-hand sides, compute L once and reuse it:

# Multiple right-hand sides
L = np.linalg.cholesky(A)
b_vectors = np.array([[7, 13, 10],
                      [1, 2, 3],
                      [5, 8, 11]]).T

solutions = []
for b in b_vectors.T:
    y = np.linalg.solve(L, b)
    x = np.linalg.solve(L.T, y)
    solutions.append(x)

solutions = np.array(solutions)
print("Solutions:\n", solutions)
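The loop can also be collapsed into a single pair of solves, since np.linalg.solve accepts a matrix whose columns are stacked right-hand sides:

```python
import numpy as np

A = np.array([[4, 2, 1],
              [2, 5, 3],
              [1, 3, 6]], dtype=float)
# Each column of B is one right-hand side
B = np.array([[7, 13, 10],
              [1, 2, 3],
              [5, 8, 11]], dtype=float).T

L = np.linalg.cholesky(A)
# Both triangular solves handle all right-hand sides at once
X = np.linalg.solve(L.T, np.linalg.solve(L, B))
print(np.allclose(A @ X, B))  # True
```

This avoids the Python-level loop and lets the underlying BLAS process all columns in one call.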

Generating Correlated Random Variables

Cholesky decomposition lets you draw samples from a multivariate normal distribution with a specified covariance structure, which is essential for Monte Carlo simulation and financial modeling.

# Generate correlated random variables
np.random.seed(42)

# Desired covariance matrix
cov_matrix = np.array([[1.0, 0.6, 0.3],
                       [0.6, 1.0, 0.5],
                       [0.3, 0.5, 1.0]])

# Cholesky decomposition of covariance
L = np.linalg.cholesky(cov_matrix)

# Generate uncorrelated standard normal samples
n_samples = 1000
uncorrelated = np.random.standard_normal((3, n_samples))

# Transform to correlated samples
correlated = L @ uncorrelated

# Verify the covariance structure
empirical_cov = np.cov(correlated)
print("Target covariance:\n", cov_matrix)
print("\nEmpirical covariance:\n", empirical_cov)
print("\nCovariance error:\n", np.abs(cov_matrix - empirical_cov))

This technique works because if Z ~ N(0, I) and Y = LZ, then Cov(Y) = L·Cov(Z)·L^T = L·I·L^T = LL^T = Σ, where Σ is the target covariance matrix.
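NumPy exposes this construction directly: in recent versions, numpy.random.Generator.multivariate_normal accepts method="cholesky" to use this factorization instead of the default SVD:

```python
import numpy as np

rng = np.random.default_rng(42)
cov = np.array([[1.0, 0.6, 0.3],
                [0.6, 1.0, 0.5],
                [0.3, 0.5, 1.0]])

# method="cholesky" factors cov exactly as above (the default is "svd")
samples = rng.multivariate_normal(np.zeros(3), cov,
                                  size=100_000, method="cholesky")
print(np.cov(samples.T).round(2))
```

For valid covariance matrices the available methods agree in distribution; "cholesky" is the fastest but the least forgiving of matrices that are only positive semi-definite.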

Matrix Inversion and Determinants

Cholesky decomposition simplifies computing matrix inverses and determinants for positive definite matrices.

# Compute inverse using Cholesky
A = np.array([[4, 2, 1],
              [2, 5, 3],
              [1, 3, 6]], dtype=float)

L = np.linalg.cholesky(A)

# Inverse of A = (L.T)^-1 @ L^-1
L_inv = np.linalg.inv(L)
A_inv = L_inv.T @ L_inv

print("A inverse via Cholesky:\n", A_inv)
print("Direct inverse:\n", np.linalg.inv(A))
print("Verification A @ A_inv:\n", A @ A_inv)

# Determinant: det(A) = det(L)^2 = (product of diagonal elements)^2
det_A = np.prod(np.diag(L)) ** 2
print("\nDeterminant via Cholesky:", det_A)
print("Direct determinant:", np.linalg.det(A))

Because L’s diagonal entries are all positive, the determinant comes out as a simple product of squares with no sign bookkeeping. For large matrices, prefer the log-determinant, 2·Σᵢ log(Lᵢᵢ), since the raw product of diagonal entries can overflow or underflow a floating-point number.
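When the determinant itself would overflow a float, the log-determinant computed from the same diagonal stays well-behaved and matches np.linalg.slogdet:

```python
import numpy as np

A = np.array([[4, 2, 1],
              [2, 5, 3],
              [1, 3, 6]], dtype=float)
L = np.linalg.cholesky(A)

# log det(A) = 2 * sum(log(diag(L))): no overflow, no sign ambiguity
logdet = 2.0 * np.sum(np.log(np.diag(L)))

sign, logdet_ref = np.linalg.slogdet(A)
print(np.isclose(logdet, logdet_ref))  # True
```

This quantity shows up directly in Gaussian log-likelihoods, which is one reason Cholesky is the standard factorization in that setting.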

Handling Numerical Issues

Real-world matrices often have numerical issues. Implementing robust checks prevents silent failures.

def safe_cholesky(A, epsilon=1e-10):
    """Perform Cholesky with numerical checks."""
    # Check symmetry
    if not np.allclose(A, A.T):
        raise ValueError("Matrix is not symmetric")
    
    # Check positive definiteness via eigenvalues
    eigenvalues = np.linalg.eigvalsh(A)
    if np.any(eigenvalues < -epsilon):
        raise ValueError(f"Matrix is not positive definite. Min eigenvalue: {eigenvalues.min()}")
    
    # Add regularization for near-singular matrices
    if eigenvalues.min() < epsilon:
        print("Warning: matrix is ill-conditioned; adding regularization.")
        A = A + epsilon * np.eye(A.shape[0])
    
    try:
        return np.linalg.cholesky(A)
    except np.linalg.LinAlgError as e:
        raise ValueError(f"Cholesky decomposition failed: {e}") from e

# Test with ill-conditioned matrix
A_ill = np.array([[1.0, 0.999],
                  [0.999, 1.0]])

L = safe_cholesky(A_ill)
print("Decomposition successful")
print("Condition number:", np.linalg.cond(A_ill))
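In practice, a cheaper positive-definiteness test than a full eigendecomposition is often to simply attempt the factorization and catch the failure:

```python
import numpy as np

def is_positive_definite(A):
    """Return True iff the symmetric matrix A admits a Cholesky factor."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite(np.eye(2)))             # True
print(is_positive_definite(np.array([[1., 2.],
                                     [2., 1.]])))  # False (eigenvalues 3, -1)
```

This piggybacks on the factorization you usually need anyway, whereas the eigenvalue check above costs a second O(n³) pass.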

Application: Gaussian Process Regression

Cholesky decomposition is fundamental to Gaussian process computations, where covariance matrices are inherently positive definite.

def gaussian_process_predict(X_train, y_train, X_test, kernel_func, noise=1e-6):
    """Simple GP regression using Cholesky decomposition."""
    n_train = X_train.shape[0]
    
    # Compute kernel matrices
    K = kernel_func(X_train, X_train) + noise * np.eye(n_train)
    K_star = kernel_func(X_test, X_train)
    
    # Cholesky decomposition of K
    L = np.linalg.cholesky(K)
    
    # Solve for alpha: K @ alpha = y
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    
    # Predictive mean
    mu = K_star @ alpha
    
    return mu

# RBF kernel
def rbf_kernel(X1, X2, length_scale=1.0):
    """Radial basis function kernel."""
    X1_sq = np.sum(X1**2, axis=1).reshape(-1, 1)
    X2_sq = np.sum(X2**2, axis=1).reshape(1, -1)
    distances_sq = X1_sq + X2_sq - 2 * X1 @ X2.T
    return np.exp(-distances_sq / (2 * length_scale**2))

# Generate test data
X_train = np.linspace(0, 10, 20).reshape(-1, 1)
y_train = np.sin(X_train).ravel() + np.random.normal(0, 0.1, 20)
X_test = np.linspace(0, 10, 100).reshape(-1, 1)

predictions = gaussian_process_predict(X_train, y_train, X_test, rbf_kernel)
print(f"Predictions shape: {predictions.shape}")
print(f"Mean prediction: {predictions.mean():.4f}")

Performance Considerations

Cholesky decomposition’s efficiency pays off only when the triangular solves actually exploit the structure. np.linalg.solve treats its input as a general matrix (it would refactorize the triangular factor), so the substitution steps below use scipy.linalg.solve_triangular instead.

import time
from scipy.linalg import solve_triangular

# Compare solving methods
sizes = [100, 500, 1000]

for n in sizes:
    # Generate random positive definite matrix
    A_rand = np.random.randn(n, n)
    A = A_rand.T @ A_rand + np.eye(n)
    b = np.random.randn(n)

    # Method 1: direct solve (LU factorization)
    start = time.perf_counter()
    x1 = np.linalg.solve(A, b)
    time_direct = time.perf_counter() - start

    # Method 2: Cholesky factorization + two triangular solves
    start = time.perf_counter()
    L = np.linalg.cholesky(A)
    y = solve_triangular(L, b, lower=True)
    x2 = solve_triangular(L.T, y, lower=False)
    time_cholesky = time.perf_counter() - start
    
    print(f"\nSize {n}x{n}:")
    print(f"  Direct solve: {time_direct*1000:.2f} ms")
    print(f"  Cholesky solve: {time_cholesky*1000:.2f} ms")
    print(f"  Speedup: {time_direct/time_cholesky:.2f}x")
    print(f"  Solution error: {np.linalg.norm(x1 - x2):.2e}")
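If SciPy is available, scipy.linalg.cho_factor and cho_solve package this factor-once, solve-many pattern directly, dispatching to triangular LAPACK solvers (a sketch; absolute timings depend on the BLAS backend):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
n = 500
M = rng.standard_normal((n, n))
A = M.T @ M + n * np.eye(n)  # well-conditioned positive definite matrix

# Factor once (O(n^3))...
c_and_low = cho_factor(A)

# ...then each additional right-hand side costs only O(n^2)
for _ in range(10):
    b = rng.standard_normal(n)
    x = cho_solve(c_and_low, b)

print(np.allclose(A @ x, b))  # True
```

This is the pattern to reach for in iterative algorithms (Kalman filters, optimization loops) where the same matrix is solved against many vectors.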

Cholesky decomposition provides both theoretical elegance and practical performance benefits for positive definite matrices. Its applications span numerical linear algebra, statistics, optimization, and machine learning, making it an essential tool for computational applications requiring efficient matrix operations.
