Two-Dimensional Arrays: Matrix Operations and Traversal

Key Insights

  • Understanding memory layout (row-major vs column-major) is essential for writing cache-efficient matrix code—poor access patterns can degrade performance by 10x or more.
  • The staircase search algorithm transforms O(m×n) brute-force searches into O(m+n) operations in sorted matrices, a technique that appears frequently in technical interviews.
  • Spiral and diagonal traversals follow predictable patterns that, once internalized, become building blocks for solving complex grid-based problems in dynamic programming and image processing.

Introduction to 2D Arrays

Two-dimensional arrays are the workhorse data structure for representing matrices, grids, game boards, and image data. Before diving into operations, you need to understand how they’re stored in memory.

Row-major order (used by C, C++, NumPy, and most languages) stores elements row by row contiguously. A 3x3 matrix occupies memory as: [0,0], [0,1], [0,2], [1,0], [1,1], [1,2].... Column-major order (used by Fortran, MATLAB, and R) stores column by column instead. (Python's built-in lists of lists are neither: each row is a separate object on the heap, so only NumPy arrays give you this contiguity.) This distinction matters enormously for performance—accessing elements in the “wrong” order causes cache misses.
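
To make the layout concrete, here is a small sketch (the helper names are mine, not a standard API) of the flat-offset arithmetic the two conventions imply:

```python
def flat_index_row_major(i, j, cols):
    """Offset of element (i, j) when rows are stored back to back."""
    return i * cols + j

def flat_index_col_major(i, j, rows):
    """Offset of element (i, j) when columns are stored back to back."""
    return j * rows + i

# Element (1, 2) of a 3x3 matrix lands in different slots:
print(flat_index_row_major(1, 2, cols=3))  # 5
print(flat_index_col_major(1, 2, rows=3))  # 7
```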

Here’s how to initialize a 3x3 matrix across common languages:

# Python - list of lists
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

# NumPy - preferred for numerical work
import numpy as np
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

// Java - true 2D array
int[][] matrix = {
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9}
};

// Dynamic initialization
int[][] dynamic = new int[3][3];

// C++ - native array
int matrix[3][3] = {
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9}
};

// std::vector for dynamic sizing
std::vector<std::vector<int>> vec(3, std::vector<int>(3, 0));

Basic Traversal Patterns

Row-wise traversal is the natural access pattern for row-major languages. You iterate through each row, then each column within that row. This maximizes cache hits because you’re accessing contiguous memory.

def print_row_wise(matrix):
    """O(m*n) time, O(1) space - cache friendly"""
    rows, cols = len(matrix), len(matrix[0])
    for i in range(rows):
        for j in range(cols):
            print(matrix[i][j], end=" ")
        print()

def print_column_wise(matrix):
    """O(m*n) time, O(1) space - cache unfriendly in row-major languages"""
    rows, cols = len(matrix), len(matrix[0])
    for j in range(cols):
        for i in range(rows):
            print(matrix[i][j], end=" ")
        print()

Both have O(m×n) time complexity, but column-wise traversal performs worse in practice due to strided memory access. For a 1000×1000 matrix, the difference can be 5-10x in execution time.
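
You can see this on your own machine by timing row slices against column slices of a NumPy array. The exact ratio depends on hardware, array size, and Python overhead, so treat this as a rough probe rather than a rigorous benchmark:

```python
import time
import numpy as np

a = np.zeros((1000, 1000))  # row-major (C order) by default

start = time.perf_counter()
for i in range(1000):
    a[i, :].sum()            # contiguous row slices
row_time = time.perf_counter() - start

start = time.perf_counter()
for j in range(1000):
    a[:, j].sum()            # strided column slices
col_time = time.perf_counter() - start

print(f"rows: {row_time:.4f}s  columns: {col_time:.4f}s")
```

Column slices are typically slower, though per-call Python overhead can mask part of the gap at this size.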

Common Matrix Operations

Transpose swaps rows and columns—element at position (i, j) moves to (j, i). For non-square matrices, you need a new matrix. For square matrices, you can transpose in-place.

def transpose(matrix):
    """Works for any m x n matrix, returns new matrix"""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j] for i in range(rows)] for j in range(cols)]

def transpose_in_place(matrix):
    """Only works for square matrices"""
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):  # Only upper triangle
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]

Rotating a matrix 90° clockwise is a common interview question. The trick: transpose first, then reverse each row.

def rotate_90_clockwise(matrix):
    """In-place rotation for square matrix"""
    n = len(matrix)
    
    # Step 1: Transpose
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
    
    # Step 2: Reverse each row
    for row in matrix:
        row.reverse()

def rotate_90_counterclockwise(matrix):
    """Transpose then reverse each column"""
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
    matrix.reverse()

For 180° rotation, simply reverse the entire matrix and then reverse each row—or rotate 90° twice.
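
That 180° recipe is short enough to show in full:

```python
def rotate_180(matrix):
    """In-place 180° rotation: reverse the row order, then each row."""
    matrix.reverse()
    for row in matrix:
        row.reverse()

m = [[1, 2], [3, 4]]
rotate_180(m)
print(m)  # [[4, 3], [2, 1]]
```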

Diagonal and Spiral Traversal

Diagonal access is straightforward once you see the pattern. The primary diagonal has equal indices (i == j). The secondary diagonal satisfies i + j == n - 1.

def diagonal_sums(matrix):
    """Returns sum of primary and secondary diagonals"""
    n = len(matrix)
    primary = secondary = 0
    
    for i in range(n):
        primary += matrix[i][i]
        secondary += matrix[i][n - 1 - i]
    
    # If n is odd, center element is counted twice
    if n % 2 == 1:
        center = matrix[n // 2][n // 2]
        return primary + secondary - center
    
    return primary + secondary

Spiral traversal is trickier but follows a systematic boundary-shrinking approach. You maintain four boundaries (top, bottom, left, right) and traverse each edge in order.

def spiral_order(matrix):
    """Returns elements in spiral order (clockwise from top-left)"""
    if not matrix or not matrix[0]:
        return []
    
    result = []
    top, bottom = 0, len(matrix) - 1
    left, right = 0, len(matrix[0]) - 1
    
    while top <= bottom and left <= right:
        # Traverse right along top row
        for col in range(left, right + 1):
            result.append(matrix[top][col])
        top += 1
        
        # Traverse down along right column
        for row in range(top, bottom + 1):
            result.append(matrix[row][right])
        right -= 1
        
        # Traverse left along bottom row (if rows remain)
        if top <= bottom:
            for col in range(right, left - 1, -1):
                result.append(matrix[bottom][col])
            bottom -= 1
        
        # Traverse up along left column (if columns remain)
        if left <= right:
            for row in range(bottom, top - 1, -1):
                result.append(matrix[row][left])
            left += 1
    
    return result

Search Algorithms in 2D Arrays

Brute force search is O(m×n)—check every element. But when the matrix has sorted properties, you can do better.

A row-wise and column-wise sorted matrix has each row sorted left-to-right and each column sorted top-to-bottom. The staircase algorithm exploits this structure.

def staircase_search(matrix, target):
    """
    Search in row-wise and column-wise sorted matrix.
    O(m + n) time complexity.
    Start from top-right corner.
    """
    if not matrix or not matrix[0]:
        return (-1, -1)
    
    rows, cols = len(matrix), len(matrix[0])
    row, col = 0, cols - 1  # Start top-right
    
    while row < rows and col >= 0:
        current = matrix[row][col]
        
        if current == target:
            return (row, col)
        elif current > target:
            col -= 1  # Eliminate current column
        else:
            row += 1  # Eliminate current row
    
    return (-1, -1)  # Not found

The insight: from the top-right corner, going left gives smaller values, going down gives larger values. Each comparison eliminates an entire row or column, yielding O(m+n) complexity. You can also start from the bottom-left corner with inverted logic.
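
For completeness, the bottom-left variant with the inverted logic might look like this:

```python
def staircase_search_bottom_left(matrix, target):
    """
    Same elimination idea, starting from the bottom-left corner:
    going up gives smaller values, going right gives larger values.
    """
    if not matrix or not matrix[0]:
        return (-1, -1)

    row, col = len(matrix) - 1, 0  # Start bottom-left

    while row >= 0 and col < len(matrix[0]):
        current = matrix[row][col]

        if current == target:
            return (row, col)
        elif current > target:
            row -= 1  # Everything right of (row, col) is even larger
        else:
            col += 1  # Everything above (row, col) is even smaller

    return (-1, -1)  # Not found
```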

Matrix Multiplication

Matrix multiplication is foundational for graphics, machine learning, and scientific computing. For matrices A (m×n) and B (n×p), the result C is m×p where each element is a dot product.

def matrix_multiply(A, B):
    """
    Standard matrix multiplication: O(m * n * p)
    A is m x n, B is n x p, result is m x p
    """
    m, n = len(A), len(A[0])
    n2, p = len(B), len(B[0])
    
    assert n == n2, "Incompatible dimensions"
    
    # Initialize result matrix with zeros
    C = [[0] * p for _ in range(m)]
    
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    
    return C

The naive algorithm is O(n³) for square matrices. Strassen’s algorithm reduces this to approximately O(n^2.807) by cleverly reducing the number of recursive multiplications from 8 to 7. In practice, libraries like NumPy use highly optimized BLAS implementations that exploit SIMD instructions, cache blocking, and parallelization—don’t roll your own for production numerical work.
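
For comparison, the same product via NumPy (which dispatches to BLAS under the hood) is a one-liner:

```python
import numpy as np

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# The @ operator calls np.matmul, which uses an optimized BLAS routine
C = np.array(A) @ np.array(B)
print(C)  # [[19 22]
          #  [43 50]]
```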

Practical Applications and Performance Tips

Image processing represents images as 2D arrays of pixel values. Convolution operations apply kernels (small matrices) across the image for effects like blurring, edge detection, and sharpening.

def apply_kernel(image, kernel):
    """
    Apply convolution kernel to grayscale image.
    Simplified version without padding.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    
    # Output dimensions (no padding)
    out_h = img_h - k_h + 1
    out_w = img_w - k_w + 1
    output = [[0] * out_w for _ in range(out_h)]
    
    for i in range(out_h):
        for j in range(out_w):
            total = 0
            for ki in range(k_h):
                for kj in range(k_w):
                    total += image[i + ki][j + kj] * kernel[ki][kj]
            output[i][j] = total
    
    return output

# Edge detection kernel (Sobel)
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

Game boards (chess, tic-tac-toe, minesweeper) naturally map to 2D arrays. Dynamic programming tables store intermediate results for problems like longest common subsequence, edit distance, and pathfinding.
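
As a small illustration of a grid-shaped DP table, here is a sketch of the classic unique-paths count (move only right or down from the top-left to the bottom-right); the 2D table itself is the memo:

```python
def unique_paths(m, n):
    """
    dp[i][j] = number of paths reaching cell (i, j).
    First row and column are all 1 (only one way to walk an edge).
    """
    dp = [[1] * n for _ in range(m)]
    for i in range(1, m):
        for j in range(1, n):
            dp[i][j] = dp[i - 1][j] + dp[i][j - 1]
    return dp[m - 1][n - 1]

print(unique_paths(3, 3))  # 6
```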

Performance tips:

  1. Access in row-major order when possible. Loop over the inner index in the innermost loop.
  2. Use contiguous arrays (NumPy, C arrays) over nested lists for numerical work.
  3. Consider cache blocking for large matrices—process in tiles that fit in L1/L2 cache.
  4. Preallocate result matrices rather than building them incrementally.
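
Tip 3 can be sketched as a tiled version of the in-place transpose shown earlier; the block size is a tunable assumption (32 here) that you would adjust to your cache sizes:

```python
def blocked_transpose(matrix, block=32):
    """
    Tiled in-place transpose for a square matrix.
    Processing block x block tiles keeps both the rows being read
    and the rows being written hot in cache for large n.
    """
    n = len(matrix)
    for bi in range(0, n, block):
        for bj in range(bi, n, block):       # Only tiles on/above the diagonal
            for i in range(bi, min(bi + block, n)):
                # Within the diagonal tile, stay strictly above the diagonal
                for j in range(max(bj, i + 1), min(bj + block, n)):
                    matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]

m = [[i * 4 + j for j in range(4)] for i in range(4)]
blocked_transpose(m, block=2)
print(m[0])  # first row of the transpose: [0, 4, 8, 12]
```

In pure Python the win is negligible; the pattern pays off in C, C++, or NumPy-level code where memory traffic dominates.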

Two-dimensional arrays are deceptively simple, but mastering their traversal patterns and understanding their memory behavior is what separates competent developers from those who write accidentally slow code. Start with the patterns here, then profile your specific use case—the right optimization depends entirely on your access patterns and data size.
