NumPy - Singular Value Decomposition (SVD)
Key Insights
- SVD decomposes any matrix A into three matrices (U, Σ, V^T) where U and V are orthogonal and Σ is diagonal, enabling dimensionality reduction, noise filtering, and solving linear systems
- Unlike eigendecomposition, which applies only to square matrices, SVD works for any m×n matrix and remains numerically stable for rank-deficient or ill-conditioned matrices
- Practical applications include image compression (reducing storage by 90%+), recommender systems (collaborative filtering), and principal component analysis for high-dimensional data
Understanding SVD Fundamentals
Singular Value Decomposition factorizes an m×n matrix A into three component matrices:
A = UΣV^T
Where:
- U is an m×m orthogonal matrix (left singular vectors)
- Σ is an m×n rectangular diagonal matrix (singular values on the diagonal, in descending order)
- V^T is an n×n orthogonal matrix (right singular vectors)
import numpy as np
# Create a sample matrix
A = np.array([[3, 1, 1],
              [-1, 3, 1],
              [1, 1, 3],
              [1, -1, 1]])
# Perform SVD
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(f"Original shape: {A.shape}")
print(f"U shape: {U.shape}")
print(f"s shape: {s.shape}")
print(f"Vt shape: {Vt.shape}")
# Reconstruct Σ as an m×n matrix with the singular values on its diagonal
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)
# Verify reconstruction
A_reconstructed = U @ Sigma @ Vt
print(f"\nReconstruction error: {np.linalg.norm(A - A_reconstructed)}")
Note that np.linalg.svd returns singular values as a 1D array, not a matrix. You must construct the diagonal matrix manually for full reconstruction.
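The orthogonality claims above are easy to verify directly. A quick sanity check on the factors from the example:

```python
import numpy as np

A = np.array([[3, 1, 1],
              [-1, 3, 1],
              [1, 1, 3],
              [1, -1, 1]], dtype=float)
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# U and V are orthogonal: transpose equals inverse
assert np.allclose(U.T @ U, np.eye(4))
assert np.allclose(Vt @ Vt.T, np.eye(3))

# Singular values come back sorted in descending order
assert np.all(s[:-1] >= s[1:])
```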
Low-Rank Approximation and Compression
SVD enables optimal low-rank approximations. By keeping only the k largest singular values, you minimize the Frobenius norm of the approximation error.
def low_rank_approximation(A, k):
    """
    Approximate matrix A using only k singular values.

    Args:
        A: Input matrix (m x n)
        k: Number of singular values to retain

    Returns:
        A_k: Rank-k approximation of A
        s: Singular values (entries beyond the k-th zeroed out)
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Zero out all but the k largest singular values
    s[k:] = 0
    # Reconstruct
    Sigma_k = np.diag(s)
    A_k = U @ Sigma_k @ Vt
    return A_k, s
# Example with random data
np.random.seed(42)
A = np.random.randn(100, 50)
for k in [5, 10, 20]:
    A_approx, singular_values = low_rank_approximation(A, k)
    error = np.linalg.norm(A - A_approx, 'fro')
    compression_ratio = (k * (A.shape[0] + A.shape[1] + 1)) / (A.shape[0] * A.shape[1])
    print(f"k={k}: Error={error:.4f}, Compression={compression_ratio:.2%}")
This technique is fundamental for data compression. Instead of storing m×n values, you store k(m+n)+k values, where k ≪ min(m,n).
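To make the storage figure concrete, a rank-k factorization can be kept as three arrays; counting their elements reproduces the k(m+n)+k formula:

```python
import numpy as np

m, n, k = 100, 50, 10
A = np.random.randn(m, n)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the factors needed for a rank-k reconstruction
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
stored = U_k.size + s_k.size + Vt_k.size
print(f"Full matrix: {m * n} values, rank-{k} factors: {stored} values")
```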
Image Compression with SVD
SVD provides lossy image compression by treating grayscale images as matrices.
from PIL import Image
import matplotlib.pyplot as plt
def compress_image(image_path, k_values):
    """
    Compress a grayscale image using SVD with different ranks.

    Args:
        image_path: Path to image file
        k_values: List of ranks to test
    """
    # Load and convert to grayscale
    img = Image.open(image_path).convert('L')
    img_array = np.array(img, dtype=np.float64)
    # Perform SVD
    U, s, Vt = np.linalg.svd(img_array, full_matrices=False)
    results = {}
    for k in k_values:
        # Reconstruct using k components
        S_k = np.diag(s[:k])
        compressed = U[:, :k] @ S_k @ Vt[:k, :]
        # Calculate metrics (8 bits per stored value, matching the source image)
        original_size = img_array.size * 8  # bits
        compressed_size = (U[:, :k].size + k + Vt[:k, :].size) * 8
        compression_ratio = compressed_size / original_size
        mse = np.mean((img_array - compressed) ** 2)
        psnr = 10 * np.log10(255**2 / mse)
        results[k] = {
            'image': compressed,
            'compression_ratio': compression_ratio,
            'psnr': psnr,
        }
        print(f"k={k}: Compression={compression_ratio:.2%}, PSNR={psnr:.2f}dB")
    return results
# Usage example (requires actual image file)
# results = compress_image('sample.jpg', [10, 25, 50, 100])
For a 512×512 image, using k=50 typically achieves 90%+ compression while maintaining visual quality (PSNR > 30dB).
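You can estimate a good k before compressing by checking how fast the singular values decay. This sketch substitutes a synthetic low-rank-plus-noise matrix for a real image, so the exact k it reports is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "image": rank-20 structure plus small noise
img = rng.standard_normal((512, 20)) @ rng.standard_normal((20, 512))
img += 0.01 * rng.standard_normal((512, 512))

s = np.linalg.svd(img, compute_uv=False)
energy = np.cumsum(s**2) / np.sum(s**2)

# Smallest k capturing 99% of the spectral energy
k = int(np.searchsorted(energy, 0.99)) + 1
print(f"k={k} captures {energy[k-1]:.2%} of the energy")
```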
Solving Linear Systems and Pseudoinverse
SVD provides the most numerically stable method for computing the Moore-Penrose pseudoinverse, essential for solving overdetermined or underdetermined linear systems.
def solve_with_svd(A, b, rcond=1e-15):
    """
    Solve Ax = b using SVD, handling singular/rectangular matrices.

    Args:
        A: Coefficient matrix (m x n)
        b: Right-hand side (m,)
        rcond: Cutoff for small singular values

    Returns:
        x: Solution vector
        residual: Residual norm
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Invert only singular values above the cutoff, avoiding
    # division by (near-)zero
    threshold = rcond * s[0]
    s_inv = np.zeros_like(s)
    np.divide(1.0, s, out=s_inv, where=s > threshold)
    # Compute pseudoinverse: A+ = V @ S+ @ U^T
    A_pinv = Vt.T @ np.diag(s_inv) @ U.T
    # Solve
    x = A_pinv @ b
    residual = np.linalg.norm(A @ x - b)
    return x, residual
# Overdetermined system (more equations than unknowns)
A = np.array([[1, 2],
              [3, 4],
              [5, 6],
              [7, 8]])
b = np.array([1, 2, 3, 4])
x, res = solve_with_svd(A, b)
print(f"Solution: {x}")
print(f"Residual: {res:.6f}")
# Compare with NumPy's lstsq
x_lstsq, residuals, rank, s = np.linalg.lstsq(A, b, rcond=None)
print(f"NumPy lstsq solution: {x_lstsq}")
print(f"Difference: {np.linalg.norm(x - x_lstsq):.10f}")
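NumPy's built-in np.linalg.pinv implements the same SVD-based pseudoinverse, so it should agree with the manual construction to machine precision:

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.], [5., 6.], [7., 8.]])
b = np.array([1., 2., 3., 4.])

# Manual SVD pseudoinverse: A+ = V @ S+ @ U^T
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_manual = Vt.T @ np.diag(1 / s) @ U.T @ b

# Built-in pseudoinverse (also SVD-based)
x_pinv = np.linalg.pinv(A) @ b
assert np.allclose(x_manual, x_pinv)
```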
Principal Component Analysis
SVD is the computational backbone of PCA, providing a more numerically stable alternative to eigendecomposition of the covariance matrix.
def pca_with_svd(X, n_components):
    """
    Perform PCA using SVD.

    Args:
        X: Data matrix (n_samples x n_features)
        n_components: Number of principal components

    Returns:
        X_transformed: Transformed data
        components: Principal components
        explained_variance_ratio: Fraction of variance explained by each component
    """
    # Center the data
    X_centered = X - np.mean(X, axis=0)
    # Perform SVD
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Principal components are rows of Vt
    components = Vt[:n_components]
    # Transform data
    X_transformed = X_centered @ components.T
    # Explained variance
    explained_variance = (s ** 2) / (X.shape[0] - 1)
    explained_variance_ratio = explained_variance / explained_variance.sum()
    return X_transformed, components, explained_variance_ratio[:n_components]
# Generate sample data
np.random.seed(42)
X = np.random.randn(100, 5)
X[:, 1] = X[:, 0] * 2 + np.random.randn(100) * 0.1 # Correlated feature
# Apply PCA
X_pca, components, var_ratio = pca_with_svd(X, n_components=3)
print(f"Transformed shape: {X_pca.shape}")
print(f"Explained variance ratio: {var_ratio}")
print(f"Cumulative variance: {np.cumsum(var_ratio)}")
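The equivalence with the covariance-matrix route can be verified numerically: the eigenvalues of the sample covariance matrix should match s²/(n−1) from the SVD of the centered data:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 5))
Xc = X - X.mean(axis=0)

# SVD route: variances are squared singular values over (n - 1)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
var_svd = s**2 / (X.shape[0] - 1)

# Eigendecomposition of the covariance matrix, sorted descending
cov = np.cov(Xc, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
assert np.allclose(var_svd, eigvals)
```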
Condition Number and Numerical Stability
SVD provides insight into matrix conditioning, critical for understanding numerical stability in computations.
def analyze_conditioning(A):
    """
    Analyze matrix conditioning using SVD.

    Args:
        A: Input matrix

    Returns:
        Dictionary with conditioning metrics
    """
    U, s, Vt = np.linalg.svd(A)
    # Condition number (ratio of largest to smallest singular value)
    cond = s[0] / s[-1] if s[-1] > 0 else np.inf
    # Effective rank (singular values above a scale-aware tolerance)
    tol = s[0] * max(A.shape) * np.finfo(float).eps
    rank = np.sum(s > tol)
    # Null space dimension
    null_space_dim = A.shape[1] - rank
    return {
        'condition_number': cond,
        'rank': rank,
        'null_space_dim': null_space_dim,
        'singular_values': s,
        'log_singular_values': np.log10(s + 1e-16),
    }
# Well-conditioned matrix
A_good = np.diag([10, 5, 3, 1])
print("Well-conditioned matrix:")
print(analyze_conditioning(A_good))
# Ill-conditioned matrix
A_bad = np.array([[1, 1], [1, 1.0001]])
print("\nIll-conditioned matrix:")
print(analyze_conditioning(A_bad))
When the condition number exceeds 10^15, expect significant numerical errors. SVD-based methods handle ill-conditioned matrices better than direct methods like Gaussian elimination.
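A classic illustration is the Hilbert matrix, whose condition number explodes with size; at n=10 a direct solve already loses most of its accurate digits:

```python
import numpy as np

# Hilbert matrix: H[i, j] = 1 / (i + j + 1), notoriously ill-conditioned
n = 10
H = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1)

s = np.linalg.svd(H, compute_uv=False)
print(f"Condition number: {s[0] / s[-1]:.2e}")

# Solve Hx = b with a known solution and measure the damage
x_true = np.ones(n)
b = H @ x_true
x = np.linalg.solve(H, b)
print(f"Max error in solution: {np.abs(x - x_true).max():.2e}")
```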
Performance Considerations
SVD computation is O(min(mn^2, m^2n)) for an m×n matrix. Use full_matrices=False when you don’t need the complete orthogonal matrices.
import time
def benchmark_svd(sizes):
    """Compare full vs reduced SVD performance."""
    for m, n in sizes:
        A = np.random.randn(m, n)
        # Full SVD
        start = time.time()
        U_full, s_full, Vt_full = np.linalg.svd(A, full_matrices=True)
        time_full = time.time() - start
        # Reduced SVD
        start = time.time()
        U_red, s_red, Vt_red = np.linalg.svd(A, full_matrices=False)
        time_reduced = time.time() - start
        print(f"{m}×{n}: Full={time_full:.4f}s, Reduced={time_reduced:.4f}s, "
              f"Speedup={time_full/time_reduced:.2f}x")
benchmark_svd([(500, 100), (1000, 200), (2000, 400)])
For large-scale problems, consider truncated SVD solvers such as scipy.sparse.linalg.svds (Lanczos/ARPACK-based) or randomized SVD (e.g. sklearn.utils.extmath.randomized_svd), which compute only the k largest singular values in roughly O(mnk) time.
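A sketch of computing only the top-k singular triplets with scipy.sparse.linalg.svds; note that, unlike np.linalg.svd, it returns singular values in ascending order:

```python
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 400))

# Compute only the 10 largest singular values/vectors
U, s, Vt = svds(A, k=10)

# svds returns singular values in ascending order; flip to match np.linalg.svd
s = s[::-1]
s_full = np.linalg.svd(A, compute_uv=False)
assert np.allclose(s, s_full[:10])
```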