How to Calculate the Condition Number of a Matrix in Python
Key Insights
- The condition number measures how sensitive a matrix’s solution is to small changes in input—values above 1000 indicate numerical instability that can cause catastrophic errors in calculations.
- NumPy’s linalg.cond() computes condition numbers efficiently using different matrix norms (L1, L2, infinity), with L2 being the default and most commonly used in practice.
- Always check condition numbers before solving linear systems in production code; ill-conditioned matrices require specialized techniques like regularization or preconditioning to avoid garbage results.
Introduction to Matrix Condition Numbers
The condition number quantifies how much a matrix amplifies errors during computation. Mathematically, it measures the ratio of the largest to smallest singular values of a matrix, telling you how “stretched” the transformation is in different directions.
Why does this matter? When solving linear systems like Ax = b, a high condition number means small changes in A or b produce massive changes in x. This destroys numerical accuracy. In machine learning, regression problems with multicollinear features produce ill-conditioned matrices. In scientific computing, poorly scaled differential equations create the same problem.
A well-conditioned matrix has a condition number close to 1. An ill-conditioned matrix has a condition number of roughly 10^6 or higher, though trouble often starts around 10^3. Here’s the difference:
```python
import numpy as np

# Well-conditioned matrix (identity matrix)
A_good = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
cond_good = np.linalg.cond(A_good)
print(f"Well-conditioned: κ = {cond_good:.2f}")  # κ = 1.00

# Ill-conditioned matrix (nearly singular)
A_bad = np.array([[1.0, 1.0],
                  [1.0, 1.00001]])
cond_bad = np.linalg.cond(A_bad)
print(f"Ill-conditioned: κ = {cond_bad:.2e}")  # κ ≈ 4e5
```
The second matrix is almost singular—its rows are nearly parallel. This creates numerical chaos when you try to invert it or solve systems with it.
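You can watch the amplification happen: perturb b in Ax = b by one part in a million, and the solution shifts in the first decimal place.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.00001]])
b = np.array([2.0, 2.00001])
x = np.linalg.solve(A, b)                # exact solution is [1, 1]

# Perturb b by one part in a million
b_pert = b + np.array([0.0, 1e-6])
x_pert = np.linalg.solve(A, b_pert)
print("solution change:", x_pert - x)    # ≈ [-0.1, 0.1]
```

A 10⁻⁶ change in the input moved the answer by 10⁻¹: an amplification factor of about 10⁵, matching the condition number.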
Mathematical Foundation
The condition number is defined as:
κ(A) = ||A|| × ||A⁻¹||
where ||·|| represents a matrix norm. Different norms give different condition numbers:
- L1 norm: Maximum absolute column sum
- L2 norm: Largest singular value (spectral norm)
- Infinity norm: Maximum absolute row sum
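These definitions are easy to verify directly against np.linalg.norm:

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0, 4.0]])

# Each matrix norm, computed from its definition
l1_manual = np.abs(A).sum(axis=0).max()             # max absolute column sum
inf_manual = np.abs(A).sum(axis=1).max()            # max absolute row sum
l2_manual = np.linalg.svd(A, compute_uv=False)[0]   # largest singular value

print(np.isclose(l1_manual, np.linalg.norm(A, 1)))        # True
print(np.isclose(inf_manual, np.linalg.norm(A, np.inf)))  # True
print(np.isclose(l2_manual, np.linalg.norm(A, 2)))        # True
```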
The L2 condition number relates directly to eigenvalues. For symmetric positive definite matrices:
κ₂(A) = λ_max / λ_min
This is the ratio of largest to smallest eigenvalues. Here’s a manual calculation:
```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Method 1: Using the definition with the L2 norm
A_norm = np.linalg.norm(A, ord=2)
A_inv = np.linalg.inv(A)
A_inv_norm = np.linalg.norm(A_inv, ord=2)
cond_manual = A_norm * A_inv_norm
print(f"Manual calculation: κ = {cond_manual:.4f}")

# Method 2: Using eigenvalues (for symmetric positive definite matrices)
eigenvalues = np.linalg.eigvalsh(A)
cond_eigen = np.max(eigenvalues) / np.min(eigenvalues)
print(f"From eigenvalues: κ = {cond_eigen:.4f}")

# Method 3: NumPy built-in
cond_numpy = np.linalg.cond(A)
print(f"NumPy function: κ = {cond_numpy:.4f}")
```
All three methods give approximately the same result for symmetric positive definite matrices. The eigenvalue approach is computationally cheaper, but it is only valid for matrices whose singular values coincide with their absolute eigenvalues, such as symmetric (more generally, normal) matrices.
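A quick counterexample shows why this restriction matters: for a non-normal matrix, the eigenvalue ratio can wildly underestimate the true condition number.

```python
import numpy as np

# Upper triangular matrix: both eigenvalues are exactly 1
A = np.array([[1.0, 100.0],
              [0.0, 1.0]])

eigs = np.abs(np.linalg.eigvals(A))
eigen_ratio = eigs.max() / eigs.min()

print(f"eigenvalue ratio: {eigen_ratio:.1f}")   # 1.0 — looks perfect
print(f"true κ₂: {np.linalg.cond(A):.2e}")      # ~1e4 — badly conditioned
```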
Using NumPy’s Built-in Functions
NumPy’s linalg.cond() is the practical tool you’ll use. It accepts different norm specifications:
```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9.1]])  # Slightly perturbed to avoid exact singularity

# Different norms
cond_1 = np.linalg.cond(A, p=1)         # L1 norm
cond_2 = np.linalg.cond(A, p=2)         # L2 norm (default)
cond_inf = np.linalg.cond(A, p=np.inf)  # Infinity norm
cond_fro = np.linalg.cond(A, p='fro')   # Frobenius norm

print(f"L1 norm: κ = {cond_1:.2e}")
print(f"L2 norm: κ = {cond_2:.2e}")
print(f"Infinity norm: κ = {cond_inf:.2e}")
print(f"Frobenius norm: κ = {cond_fro:.2e}")
```
For singular or near-singular matrices, the condition number approaches infinity. In floating point, NumPy typically returns inf or an astronomically large finite value rather than raising an error:
```python
import numpy as np

# Exactly singular matrix
A_singular = np.array([[1, 2],
                       [2, 4]])  # Second row = 2 × first row
cond_singular = np.linalg.cond(A_singular)
print(f"Singular matrix: κ = {cond_singular}")  # inf (or astronomically large)

# Check if the matrix is problematic
if np.isinf(cond_singular) or cond_singular > 1e10:
    print("WARNING: Matrix is singular or severely ill-conditioned")
```
Practical Examples and Interpretation
Let’s see condition numbers in action with different matrix types:
```python
import numpy as np

# Example 1: Well-conditioned (diagonal matrix)
A_well = np.diag([1.0, 2.0, 3.0])
print(f"Diagonal matrix: κ = {np.linalg.cond(A_well):.2f}")  # κ = 3.00

# Example 2: Ill-conditioned (classic 4×4 textbook example)
A_moderate = np.array([[10, 7, 8, 7],
                       [7, 5, 6, 5],
                       [8, 6, 10, 9],
                       [7, 5, 9, 10]])
print(f"4×4 example: κ = {np.linalg.cond(A_moderate):.2f}")  # κ ≈ 2984

# Example 3: Severely ill-conditioned (Hilbert matrix)
def hilbert(n):
    return np.fromfunction(lambda i, j: 1.0 / (i + j + 1), (n, n))

A_ill = hilbert(5)
print(f"Hilbert matrix: κ = {np.linalg.cond(A_ill):.2e}")  # κ ≈ 5e5
```
The Hilbert matrix is famously ill-conditioned. Even small Hilbert matrices have enormous condition numbers, making them terrible for numerical computation.
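To see the damage, solve a Hilbert system whose exact solution is known to be all ones (the hilbert helper is repeated so this snippet stands alone):

```python
import numpy as np

def hilbert(n):
    # Same helper as above, repeated so this snippet is self-contained
    return np.fromfunction(lambda i, j: 1.0 / (i + j + 1), (n, n))

# Solve H x = b, where b is constructed so the exact solution is all ones
n = 12
H = hilbert(n)
b = H @ np.ones(n)
x = np.linalg.solve(H, b)

print(f"κ(H) = {np.linalg.cond(H):.2e}")           # astronomically large
print(f"max |x - 1| = {np.abs(x - 1).max():.2e}")  # visibly wrong digits
```

Even though b was computed exactly from the true solution, rounding during the solve gets amplified by κ and the recovered x is noticeably wrong.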
Here’s a real-world scenario—multicollinearity in linear regression:
```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler

# Create regression data with multicollinearity
X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=42)

# Add a nearly duplicate feature
X = np.column_stack([X, X[:, 0] + np.random.normal(0, 0.01, 100)])

# Standardize
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Check the condition number of the Gram matrix (note: κ(XᵀX) = κ(X)²)
X_cond = np.linalg.cond(X_scaled.T @ X_scaled)
print(f"Design matrix condition number: κ = {X_cond:.2e}")

if X_cond > 1000:
    print("WARNING: Severe multicollinearity detected!")
    print("Consider: ridge regression, feature selection, or PCA")

# Compare with well-conditioned data
X_good, y_good = make_regression(n_samples=100, n_features=5, noise=10, random_state=42)
X_good_scaled = scaler.fit_transform(X_good)
X_good_cond = np.linalg.cond(X_good_scaled.T @ X_good_scaled)
print(f"Good design matrix: κ = {X_good_cond:.2f}")
```
When κ > 1000, your regression coefficients become unreliable. Small changes in data produce wildly different coefficient estimates.
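A stripped-down illustration of that instability, using only NumPy (all names here are made up for the demo): refit after nudging the targets slightly and watch the coefficients jump.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=1e-4, size=n)   # nearly duplicate feature
X = np.column_stack([x1, x2])
y = x1 + rng.normal(scale=0.1, size=n)

# Fit, then refit after a tiny perturbation of the targets
beta = np.linalg.lstsq(X, y, rcond=None)[0]
beta_pert = np.linalg.lstsq(X, y + rng.normal(scale=1e-3, size=n), rcond=None)[0]

print(f"κ(X) = {np.linalg.cond(X):.2e}")
print("coefficient shift:", np.abs(beta - beta_pert).max())
```

With the near-duplicate column, a perturbation of the targets on the order of 10⁻³ produces coefficient shifts many orders of magnitude larger.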
Performance Considerations and Edge Cases
Computing condition numbers is expensive—O(n³) for dense n×n matrices because it requires singular value decomposition. For large matrices, this becomes prohibitive:
```python
import numpy as np
import time

sizes = [100, 500, 1000, 2000]
for n in sizes:
    A = np.random.randn(n, n)
    start = time.time()
    cond = np.linalg.cond(A)
    elapsed = time.time() - start
    print(f"n={n:4d}: {elapsed:.4f} seconds (κ = {cond:.2e})")
```
For sparse matrices, use SciPy’s sparse linear algebra:
```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import svds

# Create a sparse matrix
n = 1000
A_sparse = sparse.random(n, n, density=0.01, format='csr')

# Rough heuristic: condition number of a leading submatrix
# (cheap, but NOT a reliable estimate of the full matrix's
# conditioning; exact computation is too expensive here)
A_dense_sample = A_sparse[:100, :100].toarray()
cond_estimate = np.linalg.cond(A_dense_sample)
print(f"Submatrix condition number: κ ≈ {cond_estimate:.2e}")

# Better: iterative methods for the largest/smallest singular values
try:
    s_max = svds(A_sparse, k=1, which='LM', return_singular_vectors=False)[0]
    s_min = svds(A_sparse, k=1, which='SM', return_singular_vectors=False)[0]
    cond_sparse = s_max / s_min
    print(f"Sparse matrix condition: κ = {cond_sparse:.2e}")
except Exception:
    print("Iterative SVD failed to converge (matrix may be too ill-conditioned)")
```
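If the matrix admits a cheap sparse LU factorization, another option is to estimate the 1-norm condition number with SciPy's onenormest, a Hager/Higham-style estimator. This is a sketch: the diagonally dominant test matrix below is constructed specifically so the factorization is stable.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import LinearOperator, onenormest, splu

# Diagonally dominant sparse test matrix, so the LU factorization is stable
n = 2000
A = (sparse.random(n, n, density=0.001, format='csc', random_state=0)
     + 10.0 * sparse.identity(n, format='csc'))

lu = splu(A)  # sparse LU factorization
A_inv = LinearOperator((n, n),
                       matvec=lu.solve,
                       rmatvec=lambda b: lu.solve(b, trans='T'))

# κ₁(A) ≈ ||A||₁ × estimate of ||A⁻¹||₁, never forming A⁻¹ explicitly
norm1_A = abs(A).sum(axis=0).max()
kappa1 = norm1_A * onenormest(A_inv)
print(f"Estimated κ₁ ≈ {kappa1:.2e}")
```

Each estimator iteration costs only a few triangular solves, so this scales far better than a dense SVD.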
For non-square matrices, the condition number uses the pseudoinverse:
```python
import numpy as np

# Rectangular matrix
A_rect = np.random.randn(100, 50)
cond_rect = np.linalg.cond(A_rect)
print(f"Rectangular matrix: κ = {cond_rect:.2e}")
# This uses the SVD: κ = σ_max / σ_min
```
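You can confirm the SVD relationship by comparing the singular-value ratio with linalg.cond directly:

```python
import numpy as np

rng = np.random.default_rng(1)
A_rect = rng.normal(size=(100, 50))
s = np.linalg.svd(A_rect, compute_uv=False)  # singular values, descending
print(np.isclose(s[0] / s[-1], np.linalg.cond(A_rect)))  # True
```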
Conclusion and Best Practices
Check condition numbers whenever you:
- Solve linear systems in production code
- Perform matrix inversions
- Fit regression models with many features
- Work with discretized differential equations
Use these thresholds as guidelines:
- κ < 10: Excellent numerical stability
- 10 ≤ κ < 100: Good, no concerns
- 100 ≤ κ < 1000: Acceptable for most applications
- 10³ ≤ κ < 10⁶: Problematic, results may be inaccurate
- κ ≥ 10⁶: Severe issues, expect garbage output
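If you check these thresholds often, it may be worth wrapping them in a small helper (conditioning_report is a hypothetical name, not a library function):

```python
import numpy as np

def conditioning_report(A):
    """Hypothetical helper: classify a matrix by the thresholds above."""
    kappa = np.linalg.cond(A)
    if not np.isfinite(kappa) or kappa > 1e6:
        level = "severe"
    elif kappa > 1e3:
        level = "problematic"
    elif kappa > 1e2:
        level = "acceptable"
    elif kappa > 10:
        level = "good"
    else:
        level = "excellent"
    return kappa, level

kappa, level = conditioning_report(np.eye(3))
print(kappa, level)  # 1.0 excellent
```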
When you encounter ill-conditioned matrices, don’t just compute anyway. Use these alternatives:
- Regularization: Add ridge penalty (Tikhonov regularization)
- Preconditioning: Transform the system to improve conditioning
- Iterative refinement: Use higher precision for intermediate steps
- Feature selection: Remove redundant variables
- Rescaling: Normalize columns to similar magnitudes
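As a quick sanity check of the first item, adding a small ridge term λI to a nearly singular Gram matrix collapses its condition number by many orders of magnitude (a sketch with synthetic data):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Append a nearly duplicate column to force ill-conditioning
X = np.column_stack([X, X[:, 0] + rng.normal(scale=1e-6, size=100)])

G = X.T @ X                   # Gram matrix, nearly singular
lam = 1e-2                    # ridge strength (illustrative value)
G_ridge = G + lam * np.eye(G.shape[0])

print(f"κ(XᵀX)      = {np.linalg.cond(G):.2e}")
print(f"κ(XᵀX + λI) = {np.linalg.cond(G_ridge):.2e}")
```

The ridge term lifts the smallest eigenvalue from nearly zero to at least λ, which is exactly why Tikhonov regularization stabilizes the solve.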
The condition number is your early warning system for numerical disasters. Ignore it at your peril: a condition number of 10¹⁵ means you’re losing roughly 15 of the ~16 significant digits double precision provides, so your solution is essentially noise. Always check, always act on the results.