NumPy - Eigenvalues and Eigenvectors (np.linalg.eig)
Key Insights
- Eigenvalues and eigenvectors reveal fundamental properties of linear transformations, with np.linalg.eig() computing both simultaneously for square matrices in a single operation
- Understanding the mathematical relationship Av = λv is essential for interpreting results: eigenvectors represent invariant directions and eigenvalues represent scaling factors
- Real-world applications include principal component analysis (PCA), stability analysis in differential equations, and Google’s PageRank algorithm, all leveraging eigendecomposition for dimensionality reduction and system characterization
Understanding Eigenvalues and Eigenvectors
An eigenvector of a square matrix A is a non-zero vector v that, when multiplied by A, results in a scalar multiple of itself. This scalar is the corresponding eigenvalue λ. Mathematically: Av = λv.
NumPy’s np.linalg.eig() function computes both eigenvalues and eigenvectors efficiently using optimized LAPACK routines. The function returns a tuple containing eigenvalues as a 1D array and eigenvectors as columns in a 2D array.
```python
import numpy as np

# Define a 2x2 matrix
A = np.array([[4, 2],
              [1, 3]])

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Matrix A:")
print(A)
print("\nEigenvalues:", eigenvalues)
print("\nEigenvectors:")
print(eigenvectors)
```
Output:

```
Matrix A:
[[4 2]
 [1 3]]

Eigenvalues: [5. 2.]

Eigenvectors:
[[ 0.89442719 -0.70710678]
 [ 0.4472136   0.70710678]]
```
Verifying the Eigenvalue Equation
To validate results, multiply the matrix by each eigenvector and compare it to the eigenvalue times the eigenvector:
```python
# Verify first eigenvalue and eigenvector
v1 = eigenvectors[:, 0]
lambda1 = eigenvalues[0]

Av1 = A @ v1
lambda_v1 = lambda1 * v1

print("A @ v1:", Av1)
print("λ1 * v1:", lambda_v1)
print("Close?", np.allclose(Av1, lambda_v1))

# Verify second eigenvalue and eigenvector
v2 = eigenvectors[:, 1]
lambda2 = eigenvalues[1]

print("\nA @ v2:", A @ v2)
print("λ2 * v2:", lambda2 * v2)
print("Close?", np.allclose(A @ v2, lambda2 * v2))
```
The np.allclose() function accounts for floating-point precision errors, returning True when values match within tolerance.
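The column-by-column check above can also be done in a single comparison. Because `eig()` returns eigenvectors as columns, multiplying the eigenvector matrix element-wise by the eigenvalue array scales each column by its eigenvalue, so the whole equation Av = λv can be verified at once (a small sketch repeating the matrix from the example above):

```python
import numpy as np

A = np.array([[4, 2],
              [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# A applied to every eigenvector at once
lhs = A @ eigenvectors
# Broadcasting scales column j by eigenvalues[j]
rhs = eigenvectors * eigenvalues

print(np.allclose(lhs, rhs))  # True within default tolerances
```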
Complex Eigenvalues
Non-symmetric matrices can produce complex eigenvalues and eigenvectors. NumPy handles this automatically by returning complex-valued arrays:
```python
# Rotation matrix (90 degrees counterclockwise)
R = np.array([[0, -1],
              [1, 0]])

eigenvalues, eigenvectors = np.linalg.eig(R)
print("Rotation matrix eigenvalues:", eigenvalues)
print("Rotation matrix eigenvectors:")
print(eigenvectors)
```
Output shows complex conjugate pairs:

```
Rotation matrix eigenvalues: [0.+1.j 0.-1.j]
```
These complex eigenvalues indicate rotation without scaling. The magnitude of complex eigenvalues (computed with np.abs()) represents the scaling factor, while the argument represents rotation angle.
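The magnitude-and-angle interpretation can be checked directly on the rotation matrix from the example above: the eigenvalues ±i have magnitude 1 (no scaling) and argument ±π/2 (a 90-degree rotation):

```python
import numpy as np

R = np.array([[0, -1],
              [1, 0]])
eigenvalues, _ = np.linalg.eig(R)

# Magnitude 1 means pure rotation with no scaling
print(np.abs(eigenvalues))    # [1. 1.]
# The argument is the rotation angle in radians (±π/2 = 90 degrees)
print(np.angle(eigenvalues))
```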
Symmetric Matrices and Real Eigenvalues
Symmetric matrices (where A = A^T) always have real eigenvalues and orthogonal eigenvectors. For symmetric matrices, use np.linalg.eigh() for better performance:
```python
# Symmetric matrix
S = np.array([[6, 2, 1],
              [2, 3, 1],
              [1, 1, 1]])

# Standard eigendecomposition
eigenvalues_eig, eigenvectors_eig = np.linalg.eig(S)

# Optimized for symmetric/Hermitian matrices
eigenvalues_eigh, eigenvectors_eigh = np.linalg.eigh(S)

print("Eigenvalues (eig):", eigenvalues_eig)
print("Eigenvalues (eigh):", eigenvalues_eigh)

# Verify orthogonality of eigenvectors (result should be the identity matrix)
print("\nEigenvector orthogonality check:")
print(eigenvectors_eigh.T @ eigenvectors_eigh)
```
The eigh() function guarantees real eigenvalues and returns them in ascending order, making it ideal for symmetric problems.
Principal Component Analysis Implementation
PCA uses eigendecomposition of the covariance matrix to find principal components—directions of maximum variance:
```python
# Generate sample data
np.random.seed(42)
X = np.random.randn(100, 3)
X[:, 1] = X[:, 0] * 2 + np.random.randn(100) * 0.5  # Correlated feature

# Center the data
X_centered = X - np.mean(X, axis=0)

# Compute covariance matrix
cov_matrix = np.cov(X_centered.T)

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

# Sort by eigenvalues (descending)
idx = eigenvalues.argsort()[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]

# Explained variance ratio
explained_variance = eigenvalues / np.sum(eigenvalues)
print("Explained variance ratio:", explained_variance)
print("\nPrincipal components (eigenvectors):")
print(eigenvectors)

# Project data onto first 2 principal components
X_pca = X_centered @ eigenvectors[:, :2]
print("\nReduced data shape:", X_pca.shape)
The eigenvectors become principal components, and eigenvalues indicate variance along each component. This enables dimensionality reduction while preserving maximum variance.
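A useful sanity check on the claim that eigenvalues measure variance: the sample variance of the data projected onto each principal component should equal that component's eigenvalue (this sketch repeats the setup above; `np.cov` and `ddof=1` both use the n−1 convention, so the values match exactly):

```python
import numpy as np

np.random.seed(42)
X = np.random.randn(100, 3)
X[:, 1] = X[:, 0] * 2 + np.random.randn(100) * 0.5

X_centered = X - np.mean(X, axis=0)
cov_matrix = np.cov(X_centered.T)

eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)
idx = eigenvalues.argsort()[::-1]
eigenvalues, eigenvectors = eigenvalues[idx], eigenvectors[:, idx]

# Variance along each principal component equals its eigenvalue
projected = X_centered @ eigenvectors
print(np.var(projected, axis=0, ddof=1))
print(eigenvalues)
```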
Matrix Diagonalization
Eigendecomposition allows matrix diagonalization: A = QΛQ^(-1), where Q contains eigenvectors and Λ is a diagonal matrix of eigenvalues:
```python
A = np.array([[3, 1],
              [1, 3]])

eigenvalues, eigenvectors = np.linalg.eig(A)

# Construct diagonal matrix of eigenvalues
Lambda = np.diag(eigenvalues)

# Reconstruct original matrix
A_reconstructed = eigenvectors @ Lambda @ np.linalg.inv(eigenvectors)

print("Original matrix:")
print(A)
print("\nReconstructed matrix:")
print(A_reconstructed)
print("\nReconstruction accurate?", np.allclose(A, A_reconstructed))
```
This decomposition simplifies matrix operations. Computing A^n becomes straightforward: A^n = QΛ^nQ^(-1), where Λ^n is trivial to compute.
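The matrix-power shortcut can be demonstrated on the same matrix: raising only the diagonal entries to the nth power and sandwiching with Q and Q⁻¹ reproduces repeated multiplication (a small sketch comparing against `np.linalg.matrix_power`):

```python
import numpy as np

A = np.array([[3, 1],
              [1, 3]])
eigenvalues, Q = np.linalg.eig(A)

# A^5 via the decomposition: only the diagonal is exponentiated
A_pow5 = Q @ np.diag(eigenvalues**5) @ np.linalg.inv(Q)

# Compare against repeated matrix multiplication
print(np.allclose(A_pow5, np.linalg.matrix_power(A, 5)))  # True
```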
Power Iteration for Dominant Eigenvalue
For large sparse matrices, computing the dominant eigenvalue (largest magnitude) using power iteration is more efficient:
```python
def power_iteration(A, num_iterations=100):
    # Random initial vector
    v = np.random.rand(A.shape[1])
    for _ in range(num_iterations):
        # Multiply by matrix
        v_new = A @ v
        # Normalize
        v_new = v_new / np.linalg.norm(v_new)
        v = v_new
    # Compute eigenvalue (Rayleigh quotient)
    eigenvalue = (v @ A @ v) / (v @ v)
    return eigenvalue, v

A = np.array([[4, 1, 2],
              [1, 3, 1],
              [2, 1, 4]])

# Power iteration
dominant_eigenvalue, dominant_eigenvector = power_iteration(A)

# Compare with np.linalg.eig
eigenvalues, eigenvectors = np.linalg.eig(A)
max_idx = np.argmax(np.abs(eigenvalues))

print("Power iteration eigenvalue:", dominant_eigenvalue)
print("NumPy eigenvalue:", eigenvalues[max_idx])
print("\nPower iteration eigenvector:", dominant_eigenvector)
print("NumPy eigenvector:", eigenvectors[:, max_idx])
```
Power iteration converges to the dominant eigenvector, making it useful for algorithms like PageRank where only the principal eigenvector matters.
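To make the PageRank connection concrete, here is a minimal sketch using a hypothetical three-page link graph (the graph, its link weights, and the variable names are invented for illustration; the damping factor 0.85 is the value commonly cited for PageRank). The rank vector is the dominant eigenvector of the "Google matrix", found by exactly the power iteration described above:

```python
import numpy as np

# Hypothetical 3-page link graph: column j distributes page j's
# rank among the pages it links to (columns sum to 1).
L = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

d = 0.85  # damping factor
n = L.shape[0]
# Google matrix: follow links with probability d, jump randomly otherwise
G = d * L + (1 - d) / n * np.ones((n, n))

# Power iteration on the rank vector
r = np.ones(n) / n
for _ in range(100):
    r = G @ r
    r = r / r.sum()  # keep it a probability distribution

print(r)  # this symmetric graph gives every page equal rank (1/3 each)
```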
Stability Analysis with Eigenvalues
Eigenvalues determine system stability in differential equations. For a system dx/dt = Ax, the system is stable if all eigenvalues have negative real parts:
```python
# System matrix for coupled differential equations
A = np.array([[-2, 1],
              [1, -2]])

eigenvalues, _ = np.linalg.eig(A)
print("System eigenvalues:", eigenvalues)
print("Real parts:", eigenvalues.real)
print("System stable?", np.all(eigenvalues.real < 0))

# Unstable system
A_unstable = np.array([[1, 2],
                       [2, 1]])

eigenvalues_unstable, _ = np.linalg.eig(A_unstable)
print("\nUnstable system eigenvalues:", eigenvalues_unstable)
print("System stable?", np.all(eigenvalues_unstable.real < 0))
```
Positive eigenvalues indicate exponential growth, while negative eigenvalues indicate decay. This analysis is fundamental in control theory and dynamical systems.
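The decay claim can be verified numerically. For dx/dt = Ax, the eigendecomposition gives the closed-form solution x(t) = Q·exp(Λt)·Q⁻¹·x₀, so a stable system's trajectory must shrink over time (a sketch using the stable matrix from the example above; the initial condition `x0` is arbitrary):

```python
import numpy as np

A = np.array([[-2, 1],
              [1, -2]])  # eigenvalues -1 and -3, both negative

eigenvalues, Q = np.linalg.eig(A)
x0 = np.array([1.0, -0.5])  # arbitrary initial state

def solve(t):
    # Closed-form solution x(t) = Q exp(Lambda t) Q^{-1} x0
    return Q @ np.diag(np.exp(eigenvalues * t)) @ np.linalg.inv(Q) @ x0

print(np.linalg.norm(solve(0.0)))  # initial norm
print(np.linalg.norm(solve(5.0)))  # far smaller: the trajectory decays
```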
Performance Considerations
For large matrices, eigendecomposition is computationally expensive (O(n³)). Use specialized functions when applicable:
```python
import time

# Large symmetric matrix
n = 1000
S = np.random.randn(n, n)
S = (S + S.T) / 2  # Make symmetric

# Time eig
start = time.time()
eigenvalues_eig, _ = np.linalg.eig(S)
time_eig = time.time() - start

# Time eigh
start = time.time()
eigenvalues_eigh, _ = np.linalg.eigh(S)
time_eigh = time.time() - start

print(f"eig time: {time_eig:.4f}s")
print(f"eigh time: {time_eigh:.4f}s")
print(f"Speedup: {time_eig/time_eigh:.2f}x")
```
For sparse matrices, use scipy.sparse.linalg.eigs() or eigsh() to compute only a subset of eigenvalues, dramatically reducing computational cost for high-dimensional problems.
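A minimal sketch of the sparse approach, assuming SciPy is available (the matrix size, density, and `k=3` are illustrative choices): `eigsh()` computes only the requested eigenpairs of a symmetric sparse matrix instead of the full spectrum.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Large sparse matrix with a random pattern, then symmetrized
n = 2000
M = sp.random(n, n, density=0.001, random_state=0, format='csr')
S = (M + M.T) / 2

# Compute only the 3 largest-magnitude eigenvalues and eigenvectors
vals, vecs = eigsh(S, k=3, which='LM')
print(vals.shape, vecs.shape)  # (3,) (2000, 3)
```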