How to Perform SVD in NumPy
Key Insights
- NumPy’s numpy.linalg.svd() function decomposes any matrix into three components (U, Σ, V^T), enabling powerful applications from image compression to recommendation systems.
- Truncated SVD keeps only the top-k singular values, allowing you to approximate matrices with dramatically reduced storage while preserving most of the important information.
- For large sparse matrices, switch to scipy.sparse.linalg.svds() to avoid memory issues—on a 100,000 × 10,000 matrix, even the reduced U factor from a dense SVD takes about 8GB in float64, and the full U about 80GB.
Introduction to SVD
Singular Value Decomposition (SVD) is one of the most useful matrix factorization techniques in applied mathematics and machine learning. It takes any matrix—regardless of shape—and breaks it down into three simpler matrices that reveal the underlying structure of your data.
The applications are everywhere. Netflix uses SVD-based methods in recommendation systems to find latent factors connecting users and movies. Data scientists use it for dimensionality reduction when PCA feels too limiting. Image processing pipelines leverage SVD for compression, noise reduction, and feature extraction. Even natural language processing relies on SVD through techniques like Latent Semantic Analysis.
NumPy makes SVD accessible with a single function call. But understanding what’s happening under the hood—and knowing when to use which parameters—separates effective implementations from naive ones.
Understanding the SVD Components
SVD factorizes a matrix A (with dimensions m × n) into three matrices:
A = U × Σ × V^T
Here’s what each component represents:
U (m × m): The left singular vectors. Each column represents a direction in the input space. These are orthonormal—perpendicular to each other with unit length.
Σ (m × n): A diagonal matrix containing singular values in descending order. These values indicate the “importance” or “strength” of each corresponding direction. Larger singular values capture more variance in the data.
V^T (n × n): The right singular vectors (transposed). Each row represents a direction in the output space. Also orthonormal.
Think of it this way: U tells you how to combine your original rows, V^T tells you how to combine your original columns, and Σ tells you how much weight each combination gets.
The singular values in Σ are always non-negative and conventionally sorted from largest to smallest. This ordering is what makes truncated SVD work—you can discard the smallest singular values (and their corresponding vectors) while retaining most of the matrix’s information.
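Both properties—orthonormal factors and descending singular values—are easy to verify numerically. A quick sketch (the example matrix and checks here are mine, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

U, S, Vt = np.linalg.svd(A)

# Columns of U are orthonormal: U^T U = I
assert np.allclose(U.T @ U, np.eye(5))
# Rows of V^T are orthonormal: Vt Vt^T = I
assert np.allclose(Vt @ Vt.T, np.eye(3))
# Singular values are non-negative and sorted largest to smallest
assert np.all(S >= 0)
assert np.all(S[:-1] >= S[1:])
print("Orthonormality and ordering verified")
```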
Basic SVD with numpy.linalg.svd()
NumPy’s SVD implementation lives in the linear algebra module. The function signature is straightforward:
import numpy as np
# Create a sample matrix
A = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
# Perform SVD
U, S, Vt = np.linalg.svd(A)
print(f"Original matrix A shape: {A.shape}")
print(f"U shape: {U.shape}")
print(f"S shape: {S.shape}")
print(f"Vt shape: {Vt.shape}")
print(f"\nU:\n{U}")
print(f"\nSingular values: {S}")
print(f"\nV^T:\n{Vt}")
Output:
Original matrix A shape: (3, 3)
U shape: (3, 3)
S shape: (3,)
Vt shape: (3, 3)
U:
[[-0.21483724 0.88723069 0.40824829]
[-0.52058739 0.24964395 -0.81649658]
[-0.82633754 -0.38794278 0.40824829]]
Singular values: [1.68481034e+01 1.06836951e+00 3.33475287e-16]
V^T:
[[-0.47967118 -0.57236779 -0.66506441]
[-0.77669099 -0.07568647 0.62531805]
[-0.40824829 0.81649658 -0.40824829]]
Notice that NumPy returns S as a 1D array of singular values, not as a diagonal matrix. This saves memory but means you’ll need to reconstruct the diagonal matrix yourself when needed.
Two key parameters control the output:
- full_matrices (default: True): When False, returns reduced matrices (m × k and k × n, where k = min(m, n)), saving memory for rectangular matrices.
- compute_uv (default: True): When False, returns only the singular values. Useful when you just need to analyze the spectrum.
# Reduced SVD for a rectangular matrix
B = np.random.rand(5, 3)
U_reduced, S, Vt_reduced = np.linalg.svd(B, full_matrices=False)
print(f"Full U would be: (5, 5), Reduced U: {U_reduced.shape}")
print(f"Full Vt would be: (3, 3), Reduced Vt: {Vt_reduced.shape}")
Reconstructing the Original Matrix
Verification matters. Always confirm your decomposition by reconstructing the original matrix:
import numpy as np
A = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
], dtype=float)
U, S, Vt = np.linalg.svd(A)
# Reconstruct: need to convert S to a diagonal matrix
Sigma = np.diag(S)
# Matrix multiplication: U @ Sigma @ Vt
A_reconstructed = U @ Sigma @ Vt
print(f"Original:\n{A}")
print(f"\nReconstructed:\n{A_reconstructed}")
print(f"\nReconstruction matches: {np.allclose(A, A_reconstructed)}")
For rectangular matrices with full_matrices=False, reconstruction is simpler:
# Rectangular matrix example
B = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]
], dtype=float)
U, S, Vt = np.linalg.svd(B, full_matrices=False)
# With reduced matrices, just use np.diag directly
B_reconstructed = U @ np.diag(S) @ Vt
print(f"Shapes - U: {U.shape}, S: {S.shape}, Vt: {Vt.shape}")
print(f"Reconstruction matches: {np.allclose(B, B_reconstructed)}")
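With the default full_matrices=True, the same rectangular matrix needs an explicitly shaped m × n Σ before the product lines up—np.diag(S) alone has the wrong shape. A sketch of that case:

```python
import numpy as np

B = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
], dtype=float)

U, S, Vt = np.linalg.svd(B)  # full_matrices=True by default

# Build the full (3, 4) Sigma: zeros everywhere except the leading diagonal
Sigma = np.zeros(B.shape)
Sigma[:len(S), :len(S)] = np.diag(S)

B_reconstructed = U @ Sigma @ Vt
print(f"Shapes - U: {U.shape}, Sigma: {Sigma.shape}, Vt: {Vt.shape}")
print(f"Reconstruction matches: {np.allclose(B, B_reconstructed)}")
```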
Truncated SVD for Dimensionality Reduction
The real power of SVD emerges when you keep only the top-k singular values. Since singular values are sorted by magnitude, the first few capture most of the matrix’s “energy.”
import numpy as np
# Create a matrix with some structure
np.random.seed(42)
A = np.random.rand(10, 3) @ np.random.rand(3, 6) # Rank at most 3
A += 0.1 * np.random.rand(10, 6) # Add noise
U, S, Vt = np.linalg.svd(A, full_matrices=False)
print(f"Singular values: {S}")
print(f"Cumulative energy: {np.cumsum(S**2) / np.sum(S**2)}")
# Rank-2 approximation
k = 2
U_k = U[:, :k]
S_k = S[:k]
Vt_k = Vt[:k, :]
A_approx = U_k @ np.diag(S_k) @ Vt_k
# Calculate reconstruction error
error = np.linalg.norm(A - A_approx, 'fro') / np.linalg.norm(A, 'fro')
print(f"\nRank-{k} approximation")
print(f"Relative Frobenius error: {error:.4f}")
print(f"Storage: original {A.size} values, compressed {U_k.size + S_k.size + Vt_k.size} values")
# Compare with rank-4 approximation
k = 4
A_approx_4 = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]
error_4 = np.linalg.norm(A - A_approx_4, 'fro') / np.linalg.norm(A, 'fro')
print(f"\nRank-{k} approximation")
print(f"Relative Frobenius error: {error_4:.4f}")
The Eckart-Young theorem guarantees that this truncated SVD gives the best rank-k approximation in both Frobenius and spectral norms. You can’t do better with any other method.
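You can spot-check this optimality claim numerically: any other rank-k factorization L @ R should do no better than the truncated SVD. The comparison below (an illustration I've added, not from the original) fits the best R for each random L via least squares to make the test fair:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 6))
k = 2

U, S, Vt = np.linalg.svd(A, full_matrices=False)
svd_error = np.linalg.norm(A - U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :], 'fro')

# Try 100 random rank-k factorizations; none should beat the SVD error
for _ in range(100):
    L = rng.standard_normal((10, k))
    # Best (k, 6) right factor for this L, in the least-squares sense
    R = np.linalg.lstsq(L, A, rcond=None)[0]
    assert np.linalg.norm(A - L @ R, 'fro') >= svd_error - 1e-10

print(f"Truncated SVD error {svd_error:.4f} was never beaten")
```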
Practical Application: Image Compression
Image compression demonstrates SVD’s practical value. A grayscale image is just a matrix of pixel intensities—perfect for SVD.
import numpy as np
import matplotlib.pyplot as plt
# Create a synthetic grayscale image (or load your own)
# Using a gradient with some patterns
x = np.linspace(0, 4*np.pi, 256)
y = np.linspace(0, 4*np.pi, 256)
X, Y = np.meshgrid(x, y)
image = (np.sin(X) * np.cos(Y) + 1) * 127.5 # Values 0-255
# Perform SVD
U, S, Vt = np.linalg.svd(image, full_matrices=False)
def compress_image(U, S, Vt, k):
    """Reconstruct image using top-k singular values."""
    return U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

def compression_ratio(original_shape, k):
    """Calculate compression ratio."""
    m, n = original_shape
    original_size = m * n
    compressed_size = k * (m + n + 1)
    return original_size / compressed_size
# Test different compression levels
k_values = [5, 20, 50, 100]
fig, axes = plt.subplots(2, 3, figsize=(12, 8))
axes = axes.flatten()
# Original image
axes[0].imshow(image, cmap='gray')
axes[0].set_title(f'Original\n{image.shape[0]}×{image.shape[1]} = {image.size} values')
axes[0].axis('off')
# Singular value spectrum
axes[1].semilogy(S)
axes[1].set_xlabel('Index')
axes[1].set_ylabel('Singular Value (log scale)')
axes[1].set_title('Singular Value Spectrum')
axes[1].axhline(y=S[50], color='r', linestyle='--', alpha=0.5)
# Compressed versions
for idx, k in enumerate(k_values):
    compressed = compress_image(U, S, Vt, k)
    ratio = compression_ratio(image.shape, k)
    error = np.linalg.norm(image - compressed) / np.linalg.norm(image)
    axes[idx + 2].imshow(compressed, cmap='gray')
    axes[idx + 2].set_title(f'k={k}\nRatio: {ratio:.1f}x, Error: {error:.3f}')
    axes[idx + 2].axis('off')
plt.tight_layout()
plt.savefig('svd_compression.png', dpi=150)
plt.show()
# Print detailed statistics
print("\nCompression Statistics:")
print("-" * 50)
for k in k_values:
    compressed = compress_image(U, S, Vt, k)
    ratio = compression_ratio(image.shape, k)
    error = np.linalg.norm(image - compressed) / np.linalg.norm(image)
    storage = k * (image.shape[0] + image.shape[1] + 1)
    print(f"k={k:3d}: {ratio:5.1f}x compression, {error:.4f} error, {storage:6d} values stored")
With k=50 on a 256×256 image, you store only 25,650 values instead of 65,536—about 2.5x compression. Note that this synthetic image is essentially rank 2 (a constant plus an outer product of sin and cos), so even tiny k reconstructs it almost perfectly; for natural images, the error at a given k depends on how quickly the singular values decay.
Performance Tips and Common Pitfalls
Memory is the bottleneck. Full SVD on an m × n matrix creates a U matrix of size m × m. For a 100,000 × 10,000 matrix, that’s 10 billion elements—roughly 80GB in float64. Always use full_matrices=False for rectangular matrices.
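A quick sketch of the float64 memory footprint, full versus reduced:

```python
# Memory footprint of the U factor for an m x n float64 matrix
m, n = 100_000, 10_000
bytes_per_element = 8  # float64

full_U = m * m * bytes_per_element              # full_matrices=True: m x m
reduced_U = m * min(m, n) * bytes_per_element   # full_matrices=False: m x min(m, n)

print(f"Full U:    {full_U / 1e9:,.0f} GB")     # 80 GB
print(f"Reduced U: {reduced_U / 1e9:,.0f} GB")  # 8 GB
```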
Use sparse SVD for large, sparse matrices. When your matrix has mostly zeros, scipy.sparse.linalg.svds() computes only the top-k singular values without materializing dense matrices:
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds
import numpy as np
# Create a large sparse matrix
sparse_matrix = sparse_random(10000, 5000, density=0.01, format='csr')
# Compute only top 10 singular values/vectors
U, S, Vt = svds(sparse_matrix, k=10)
# Note: svds returns singular values in ascending order
# Reverse to match numpy.linalg.svd convention
U = U[:, ::-1]
S = S[::-1]
Vt = Vt[::-1, :]
print(f"Sparse matrix shape: {sparse_matrix.shape}")
print(f"Non-zero elements: {sparse_matrix.nnz} ({100*sparse_matrix.nnz/np.prod(sparse_matrix.shape):.2f}%)")
print(f"Top 10 singular values: {S}")
Watch for numerical instability. Matrices with singular values spanning many orders of magnitude can cause precision issues. Check your condition number (np.linalg.cond(A)) before trusting reconstruction results.
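Hilbert matrices make a classic stress test here—their condition numbers grow explosively with size, so they are a handy way to see what a "bad" condition number looks like (this example is mine, not from the article):

```python
import numpy as np

# Hilbert matrix: H[i, j] = 1 / (i + j + 1), notoriously ill-conditioned
n = 10
H = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1)

# Condition number = ratio of largest to smallest singular value
print(f"Condition number: {np.linalg.cond(H):.2e}")  # ~1e13, near the float64 limit

U, S, Vt = np.linalg.svd(H)
print(f"Largest singular value:  {S[0]:.2e}")
print(f"Smallest singular value: {S[-1]:.2e}")
```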
Don’t compute what you don’t need. If you only need singular values for analysis, use compute_uv=False:
# Fast: only computes singular values
S = np.linalg.svd(large_matrix, compute_uv=False)
SVD is computationally expensive—O(min(mn², m²n)) for an m × n matrix. For iterative algorithms that repeatedly need low-rank approximations, consider randomized SVD implementations in scikit-learn (sklearn.utils.extmath.randomized_svd) which trade some accuracy for significant speed gains.
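For intuition, here is a minimal pure-NumPy sketch of the randomized SVD idea—project onto a small random subspace, then take an exact SVD of the much smaller projected matrix. The real scikit-learn implementation adds refinements such as power iterations; this is just the core trick, with a test matrix chosen so the sketch recovers the spectrum almost exactly:

```python
import numpy as np

def randomized_svd_sketch(A, k, oversample=10, seed=None):
    """Approximate top-k SVD via random range sampling (a simplified sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # 1. Sample the range of A with a random Gaussian test matrix
    Omega = rng.standard_normal((n, k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)  # orthonormal basis for the sampled range
    # 2. Exact SVD of the small (k + oversample) x n projected matrix
    U_small, S, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U_small[:, :k], S[:k], Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 15)) @ rng.standard_normal((15, 500))  # rank 15

U, S, Vt = randomized_svd_sketch(A, k=10, seed=1)
exact_S = np.linalg.svd(A, compute_uv=False)[:10]
print(f"Max relative error in top-10 singular values: "
      f"{np.max(np.abs(S - exact_S) / exact_S):.2e}")
```

Because the test matrix has rank 15 and we sample 20 directions, the random projection captures the full column space, so the top singular values match the exact SVD to near machine precision.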