How to Multiply Matrices in Python with NumPy
Key Insights
- NumPy provides three ways to multiply matrices: the @ operator (recommended for Python 3.5+), np.matmul(), and np.dot(), each with subtle differences in how they handle arrays of different dimensions.
- Element-wise multiplication (the * operator) and matrix multiplication (the @ operator) produce completely different results; confusing them is one of the most common mistakes when working with NumPy arrays.
- Matrix multiplication requires compatible dimensions: the number of columns in the first matrix must equal the number of rows in the second, otherwise NumPy raises a ValueError.
Introduction to Matrix Multiplication
Matrix multiplication is a fundamental operation in linear algebra where you combine two matrices to produce a third matrix. Unlike simple element-wise operations, matrix multiplication follows specific rules: each element in the resulting matrix is the dot product of a row from the first matrix and a column from the second matrix.
NumPy has become the de facto standard for matrix operations in Python because it’s built on highly optimized C and Fortran libraries (BLAS and LAPACK). This makes it orders of magnitude faster than pure Python implementations. Whether you’re building neural networks, processing images, or performing statistical analysis, you’ll inevitably need to multiply matrices efficiently.
In machine learning, matrix multiplication powers everything from linear regression to deep neural networks. Image transformations rely on matrix operations for rotations, scaling, and convolutions. Statistical analyses use matrix multiplication for covariance calculations and principal component analysis. Understanding how to perform these operations correctly in NumPy is essential for any data scientist or engineer.
Setting Up NumPy and Creating Matrices
Installing NumPy is straightforward using pip:
pip install numpy
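To confirm the installation worked, you can print the installed version (a quick sanity check, not strictly required):

```python
import numpy as np

# Print the installed NumPy version to confirm the import works
print(np.__version__)
```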
Once installed, you can create matrices (2D arrays) using several methods:
import numpy as np
# Create matrices from lists
A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])
# Create matrices filled with zeros or ones
zeros_matrix = np.zeros((3, 3))
ones_matrix = np.ones((2, 4))
# Create random matrices
random_matrix = np.random.rand(3, 3) # Values between 0 and 1
random_int_matrix = np.random.randint(0, 10, size=(2, 3)) # Random integers
# Check shapes
print(f"A shape: {A.shape}") # Output: (2, 3)
print(f"B shape: {B.shape}") # Output: (3, 2)
Understanding matrix shapes is critical. A shape of (2, 3) means 2 rows and 3 columns. This becomes crucial when performing matrix multiplication, as dimension compatibility determines whether the operation is valid.
Element-wise Multiplication vs. Matrix Multiplication
This is where many developers stumble. The * operator performs element-wise multiplication (also called the Hadamard product), while the @ operator performs true matrix multiplication.
import numpy as np
# Two matrices of the same shape
X = np.array([[1, 2],
              [3, 4]])
Y = np.array([[5, 6],
              [7, 8]])
# Element-wise multiplication (Hadamard product)
elementwise = X * Y
print("Element-wise multiplication:")
print(elementwise)
# Output:
# [[ 5 12]
# [21 32]]
# Matrix multiplication
matrix_mult = X @ Y
print("\nMatrix multiplication:")
print(matrix_mult)
# Output:
# [[19 22]
# [43 50]]
# Alternative methods for matrix multiplication
matrix_mult_matmul = np.matmul(X, Y)
matrix_mult_dot = np.dot(X, Y)
print(f"\nAll equal: {np.array_equal(matrix_mult, matrix_mult_matmul) and np.array_equal(matrix_mult, matrix_mult_dot)}")
# Output: True
Element-wise multiplication simply multiplies corresponding elements: 1*5=5, 2*6=12, etc. Matrix multiplication follows the mathematical definition: the element at position [i,j] in the result is the dot product of row i from the first matrix and column j from the second matrix.
For the matrix multiplication above, the element at position [0,0] is calculated as: 1*5 + 2*7 = 19.
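You can check this calculation directly with NumPy slicing, taking the dot product of the first row of X with the first column of Y:

```python
import numpy as np

X = np.array([[1, 2],
              [3, 4]])
Y = np.array([[5, 6],
              [7, 8]])

# Element [0, 0] of X @ Y is row 0 of X dotted with column 0 of Y
manual = np.sum(X[0, :] * Y[:, 0])  # 1*5 + 2*7 = 19
print(manual)         # 19
print((X @ Y)[0, 0])  # 19
```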
Performing Matrix Multiplication
The @ operator is the cleanest and most Pythonic way to multiply matrices in Python 3.5 and later:
import numpy as np
# Compatible dimensions: (2,3) @ (3,2) = (2,2)
A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])
result = A @ B
print("Result shape:", result.shape) # (2, 2)
print(result)
# Output:
# [[ 58 64]
# [139 154]]
# Using np.matmul() - functionally equivalent
result_matmul = np.matmul(A, B)
# Using np.dot() - also equivalent for 2D arrays
result_dot = np.dot(A, B)
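The @ operator also handles matrix-vector products: a 1-D array on the right is treated as a column vector, and the result is again 1-D:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])  # shape (2, 3)
v = np.array([1, 0, 2])    # shape (3,)

# NumPy treats v as a column vector, multiplies, then drops the extra axis
result = A @ v
print(result)        # [ 7 16]
print(result.shape)  # (2,)
```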
Matrix multiplication requires that the number of columns in the first matrix equals the number of rows in the second. If you try to multiply incompatible matrices, NumPy will raise an error:
import numpy as np
# Incompatible dimensions: (2,3) and (2,3)
A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[7, 8, 9],
              [10, 11, 12]])
try:
    result = A @ B
except ValueError as e:
    print(f"Error: {e}")
# Output: Error: matmul: Input operand 1 has a mismatch in its core dimension 0,
# with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 3)
The error message tells you exactly what went wrong: the inner dimensions don’t match.
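If you would rather fail fast with a clearer message, you can check shapes yourself before multiplying. The helper below is a sketch for the 2-D case (can_matmul is an illustrative name, not a NumPy function):

```python
import numpy as np

def can_matmul(a, b):
    """Return True if 2-D arrays a and b have compatible inner dimensions."""
    return a.ndim == 2 and b.ndim == 2 and a.shape[1] == b.shape[0]

A = np.ones((2, 3))
B = np.ones((2, 3))

if can_matmul(A, B):
    print((A @ B).shape)
else:
    print(f"Cannot multiply {A.shape} by {B.shape}")
```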
Advanced Matrix Operations
NumPy handles more complex scenarios elegantly. You can chain multiple matrix multiplications:
import numpy as np
A = np.random.rand(2, 3)
B = np.random.rand(3, 4)
C = np.random.rand(4, 2)
# Chain multiplication
result = A @ B @ C
print("Chained result shape:", result.shape) # (2, 2)
# This is evaluated left-to-right: (A @ B) @ C
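When chaining several multiplications, np.linalg.multi_dot computes the same product but chooses the evaluation order that minimizes the total number of scalar operations, which can matter when the matrix sizes differ widely:

```python
import numpy as np

A = np.random.rand(2, 3)
B = np.random.rand(3, 4)
C = np.random.rand(4, 2)

# Same result as A @ B @ C, but multi_dot picks the cheapest parenthesization
result = np.linalg.multi_dot([A, B, C])
print(result.shape)                    # (2, 2)
print(np.allclose(result, A @ B @ C))  # True
```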
For batch operations with 3D arrays, np.matmul() and @ handle broadcasting intelligently:
import numpy as np
# Batch of matrices: 5 batches of (3x4) matrices
batch_A = np.random.rand(5, 3, 4)
# Single (4x2) matrix to multiply with each batch
B = np.random.rand(4, 2)
# Broadcasting: each (3x4) matrix multiplied by (4x2)
result = batch_A @ B
print("Batch result shape:", result.shape) # (5, 3, 2)
# Batch-to-batch multiplication
batch_B = np.random.rand(5, 4, 2)
result_batch = batch_A @ batch_B
print("Batch-to-batch shape:", result_batch.shape) # (5, 3, 2)
This broadcasting capability is powerful for machine learning applications where you need to process multiple samples simultaneously.
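The same batch product can also be written with np.einsum, which spells out the index contraction explicitly. It is equivalent to batch_A @ B here, though @ remains the clearer choice for plain matrix products:

```python
import numpy as np

batch_A = np.random.rand(5, 3, 4)
B = np.random.rand(4, 2)

# 'bij,jk->bik': contract over j (the inner dimension) for each batch b
result_einsum = np.einsum('bij,jk->bik', batch_A, B)
print(result_einsum.shape)                      # (5, 3, 2)
print(np.allclose(result_einsum, batch_A @ B))  # True
```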
Performance Considerations and Best Practices
For modern Python code (3.5+), use the @ operator. It’s readable, concise, and performs identically to np.matmul():
import numpy as np
import timeit
A = np.random.rand(1000, 1000)
B = np.random.rand(1000, 1000)
# Time different methods
time_at = timeit.timeit(lambda: A @ B, number=100)
time_matmul = timeit.timeit(lambda: np.matmul(A, B), number=100)
time_dot = timeit.timeit(lambda: np.dot(A, B), number=100)
print(f"@ operator: {time_at:.4f} seconds")
print(f"np.matmul: {time_matmul:.4f} seconds")
print(f"np.dot: {time_dot:.4f} seconds")
# All three are typically within 1-2% of each other
The performance differences between these methods are negligible for most use cases. The @ operator is preferred because it’s clearer and more maintainable.
For very large matrices, consider these optimization strategies:
import numpy as np
# Use appropriate data types
A = np.random.rand(5000, 5000).astype(np.float32) # 32-bit instead of 64-bit
B = np.random.rand(5000, 5000).astype(np.float32)
# This uses half the memory and is often faster
result = A @ B
# For matrices that are mostly zeros, use scipy.sparse
from scipy import sparse
# Build genuinely sparse matrices (about 1% nonzero) in CSR format
sparse_A = sparse.random(1000, 1000, density=0.01, format='csr')
sparse_B = sparse.random(1000, 1000, density=0.01, format='csr')
sparse_result = sparse_A @ sparse_B
When working with extremely large matrices that don’t fit in memory, consider chunking your operations or using libraries like Dask that provide out-of-core computation.
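As a rough illustration of the chunking idea in plain NumPy (a sketch of the concept, not a substitute for Dask), you can compute a product one row block at a time so only a slice of the work is in flight at once:

```python
import numpy as np

def blocked_matmul(A, B, block_size=256):
    """Multiply A @ B one row block of A at a time (illustrative sketch)."""
    out = np.empty((A.shape[0], B.shape[1]), dtype=np.result_type(A, B))
    for start in range(0, A.shape[0], block_size):
        stop = start + block_size
        out[start:stop] = A[start:stop] @ B  # only this block is computed here
    return out

A = np.random.rand(1000, 50)
B = np.random.rand(50, 30)
print(np.allclose(blocked_matmul(A, B), A @ B))  # True
```

With true out-of-core data, each block of A would be loaded from disk inside the loop instead of sliced from memory.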
One subtle difference: np.dot() behaves differently from np.matmul() with arrays of more than 2 dimensions. For 2D arrays they’re identical, but for higher dimensions, np.dot() performs a sum product over the last axis of the first array and the second-to-last axis of the second array, while np.matmul() performs true matrix multiplication. Stick with @ or np.matmul() for consistency.
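A quick shape comparison makes the difference concrete: for 3-D inputs, np.matmul performs a batch of matrix products, while np.dot combines every row of the first array with every matrix in the second, producing a higher-dimensional result:

```python
import numpy as np

a = np.random.rand(2, 3, 4)
b = np.random.rand(2, 4, 5)

print(np.matmul(a, b).shape)  # (2, 3, 5) - a batch of 2 matrix products
print(np.dot(a, b).shape)     # (2, 3, 2, 5) - sum product over all pairings
```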
Conclusion
Matrix multiplication in NumPy is straightforward once you understand the distinction between element-wise and matrix multiplication. Use the @ operator for clean, readable code. Remember that dimension compatibility is non-negotiable: the inner dimensions must match.
For most applications, the @ operator is your best choice. Use np.matmul() when you need to pass the function as a parameter or want to be explicit. Reserve np.dot() for specific cases where you need its unique behavior with higher-dimensional arrays.
The key takeaways: always verify your matrix shapes before multiplication, use @ for readability, and leverage NumPy’s broadcasting capabilities for batch operations. With these tools, you can handle everything from simple linear algebra to complex machine learning pipelines efficiently.
For more details, consult the official NumPy documentation on array manipulation and linear algebra routines at numpy.org/doc/stable.