Hungarian Algorithm: Assignment Problem
Key Insights
- The Hungarian Algorithm solves the assignment problem in O(n³) time, making it dramatically more efficient than the O(n!) brute-force approach—for 20 workers and tasks, that’s the difference between 8,000 operations and 2.4 quintillion.
- The algorithm works by manipulating a cost matrix through row/column reduction and zero covering until an optimal assignment emerges where each worker maps to exactly one task at minimum total cost.
- Real-world applications extend far beyond simple job scheduling—load balancing in distributed systems, matching in recommendation engines, and resource allocation in cloud infrastructure all rely on efficient assignment algorithms.
Introduction to the Assignment Problem
You have five developers and five features to build. Each developer has different skills, so the time to complete each feature varies by who’s assigned to it. Your goal: assign each developer to exactly one feature such that total development time is minimized.
This is the assignment problem. Formally, given an n×n cost matrix where entry C[i][j] represents the cost of assigning worker i to task j, find a one-to-one mapping that minimizes total cost.
The problem appears everywhere in software systems. Container orchestration platforms solve assignment problems when scheduling pods to nodes. Ride-sharing services match drivers to passengers. Content delivery networks route requests to edge servers. Any time you’re pairing resources with consumers and want to optimize some metric, you’re likely facing an assignment problem.
Why Not Brute Force?
The naive approach is straightforward: enumerate all possible assignments, calculate each total cost, and pick the minimum. For n workers and n tasks, there are n! possible assignments.
```python
import itertools

def brute_force_assignment(cost_matrix):
    n = len(cost_matrix)
    min_cost = float('inf')
    best_assignment = None
    for perm in itertools.permutations(range(n)):
        cost = sum(cost_matrix[i][perm[i]] for i in range(n))
        if cost < min_cost:
            min_cost = cost
            best_assignment = perm
    return best_assignment, min_cost
```
This works fine for n=5 (120 permutations). At n=10, you’re evaluating 3.6 million assignments. At n=15, it’s 1.3 trillion. At n=20, you’re looking at 2.4×10¹⁸ permutations—even at a billion evaluations per second, you’d wait 76 years.
The Hungarian Algorithm, developed by Harold Kuhn in 1955 based on earlier work by Hungarian mathematicians Dénes Kőnig and Jenő Egerváry, solves this in O(n³) time. That’s 8,000 operations for n=20.
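The growth gap is easy to verify directly; the timing claims above follow from these raw counts:

```python
import math

for n in (5, 10, 15, 20):
    # n! assignments for brute force vs. roughly n^3 steps for Hungarian
    print(f"n={n}: brute force {math.factorial(n):.1e} vs Hungarian ~{n**3}")
```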
The Hungarian Algorithm: Core Concepts
The algorithm operates on a cost matrix and exploits a key insight: subtracting a constant from any row or column doesn’t change which assignment is optimal—it only shifts all total costs by that constant.
The goal is to transform the matrix until we can find n zeros such that no two are in the same row or column. These zeros represent the optimal assignment because we’ve reduced costs to the point where the best choices cost nothing relative to other options in their rows and columns.
The algorithm maintains this invariant: the minimum total cost in the reduced matrix plus all the values we’ve subtracted equals the minimum total cost in the original matrix.
Think of it as a bipartite matching problem. Workers form one set of vertices, tasks form another, and edges connect every worker to every task with weights from the cost matrix. We’re finding a perfect matching with minimum total weight.
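The row/column invariance is easy to check empirically. This sketch runs SciPy's solver on a small made-up matrix before and after subtracting a constant from one row: the optimal assignment is unchanged, and the optimal cost shifts by exactly that constant:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Small illustrative matrix (made up for this demonstration)
C = np.array([[4., 1., 3.],
              [2., 0., 5.],
              [3., 2., 2.]])
r1, c1 = linear_sum_assignment(C)

C2 = C.copy()
C2[0] -= 1  # subtract a constant from row 0
r2, c2 = linear_sum_assignment(C2)

assert (c1 == c2).all()                          # same optimal assignment
assert C2[r2, c2].sum() == C[r1, c1].sum() - 1   # cost shifted by the constant
```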
Step-by-Step Algorithm Walkthrough
The algorithm proceeds in four main phases, repeated until an optimal assignment is found:
Step 1: Row Reduction — Subtract the minimum value in each row from all elements in that row. This guarantees at least one zero per row.
Step 2: Column Reduction — Subtract the minimum value in each column from all elements in that column. Now we have at least one zero per column too.
Step 3: Cover Zeros with Minimum Lines — Draw the minimum number of horizontal and vertical lines needed to cover all zeros. If you need n lines, you’ve found an optimal assignment. If fewer, proceed to step 4.
Step 4: Adjust the Matrix — Find the smallest uncovered value. Subtract it from all uncovered elements and add it to elements covered by two lines (intersections). Return to step 3.
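Steps 1 and 2 are one line each in NumPy. On a small made-up matrix you can watch zeros appear in every row and then every column:

```python
import numpy as np

# Illustrative 3x3 matrix (made up for this sketch)
C = np.array([[4., 1., 3.],
              [2., 0., 5.],
              [3., 2., 2.]])
C -= C.min(axis=1, keepdims=True)  # Step 1: every row now contains a zero
C -= C.min(axis=0, keepdims=True)  # Step 2: every column now contains a zero
print(C)
```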
Here’s a complete implementation:
```python
import numpy as np

def hungarian_algorithm(cost_matrix):
    """
    Solve the assignment problem using the Hungarian Algorithm.
    Returns (assignment, total_cost) where assignment[i] = j means
    worker i is assigned to task j.
    """
    cost = np.array(cost_matrix, dtype=float)
    n = cost.shape[0]
    # Step 1: Row reduction
    cost -= cost.min(axis=1, keepdims=True)
    # Step 2: Column reduction
    cost -= cost.min(axis=0, keepdims=True)
    while True:
        # Step 3: Find minimum lines to cover zeros
        row_covered, col_covered = cover_zeros(cost)
        if row_covered.sum() + col_covered.sum() >= n:
            break
        # Step 4: Adjust matrix. Subtracting the minimum uncovered value
        # from every uncovered row and then adding it back to every covered
        # column is equivalent to the textbook rule: uncovered elements
        # decrease, doubly covered elements increase, and singly covered
        # elements are unchanged.
        uncovered_min = cost[~row_covered][:, ~col_covered].min()
        cost[~row_covered] -= uncovered_min
        cost[:, col_covered] += uncovered_min
    # Extract optimal assignment
    assignment = extract_assignment(cost)
    original_cost = np.array(cost_matrix)
    total = sum(original_cost[i, assignment[i]] for i in range(n))
    return assignment, total
```
Implementation Deep Dive
The tricky parts are covering zeros optimally and extracting the final assignment. Let’s implement these helper functions:
```python
def cover_zeros(cost):
    """
    Cover all zeros with a greedy heuristic: repeatedly cover the row or
    column containing the most uncovered zeros. This is not guaranteed to
    use the true minimum number of lines on every input (that requires a
    maximum-matching argument via König's theorem), but it keeps the code
    readable and works on typical inputs.
    Returns boolean arrays for covered rows and columns.
    """
    n = cost.shape[0]
    row_covered = np.zeros(n, dtype=bool)
    col_covered = np.zeros(n, dtype=bool)
    # Compare with a tolerance: the reductions are floating-point arithmetic
    zero_mask = np.isclose(cost, 0)
    while True:
        uncovered_zeros = zero_mask & ~row_covered[:, None] & ~col_covered
        if not uncovered_zeros.any():
            break
        row_zero_count = uncovered_zeros.sum(axis=1)
        col_zero_count = uncovered_zeros.sum(axis=0)
        if row_zero_count.max() >= col_zero_count.max():
            row_covered[row_zero_count.argmax()] = True
        else:
            col_covered[col_zero_count.argmax()] = True
    return row_covered, col_covered
```
```python
def extract_assignment(cost):
    """
    Extract the optimal assignment from a fully reduced matrix
    using depth-first backtracking over the zero entries.
    """
    n = cost.shape[0]
    assignment = [-1] * n
    col_assigned = [False] * n

    def backtrack(row):
        if row == n:
            return True
        # Try every unassigned column holding a zero in this row
        zero_cols = [j for j in range(n)
                     if np.isclose(cost[row, j], 0) and not col_assigned[j]]
        for col in zero_cols:
            assignment[row] = col
            col_assigned[col] = True
            if backtrack(row + 1):
                return True
            col_assigned[col] = False
        return False

    backtrack(0)
    return assignment
```
Let’s test with a concrete example:
```python
cost_matrix = [
    [82, 83, 69, 92],
    [77, 37, 49, 92],
    [11, 69, 5, 86],
    [8, 9, 98, 23]
]

assignment, total = hungarian_algorithm(cost_matrix)
print(f"Assignment: {assignment}")  # [2, 1, 0, 3] or equivalent
print(f"Total cost: {total}")       # 140
```
Complexity Analysis and Optimizations
The Hungarian Algorithm achieves O(n³) time when implemented with careful bookkeeping (the Munkres star/prime scheme, which avoids recomputing the zero cover from scratch). The teaching version above re-covers zeros on every iteration, so each adjustment costs O(n²) and the worst case is closer to O(n⁴); the structure of the algorithm is the same, but the tight bound requires the extra bookkeeping.
Space complexity is O(n²) for storing the cost matrix and O(n) for the covering arrays.
For production use, don’t implement this yourself. SciPy provides a highly optimized implementation:
```python
from scipy.optimize import linear_sum_assignment
import numpy as np
import time

def benchmark(n, iterations=10):
    """Time scipy's solver on random n x n cost matrices."""
    cost = np.random.randint(1, 100, size=(n, n))
    start = time.time()
    for _ in range(iterations):
        linear_sum_assignment(cost)
    scipy_time = (time.time() - start) / iterations
    print(f"n={n}: scipy={scipy_time*1000:.2f}ms")

benchmark(100)   # ~0.5ms
benchmark(500)   # ~15ms
benchmark(1000)  # ~80ms
```
For very large problems (n > 10,000), consider the Jonker-Volgenant algorithm, which has better practical performance despite the same theoretical complexity. The lap library provides this implementation.
Practical Applications and Extensions
Unbalanced Problems: When you have more workers than tasks (or vice versa), pad the matrix with dummy rows or columns filled with zeros:
```python
def solve_unbalanced(cost_matrix):
    """
    Handle rectangular cost matrices by padding with zeros.
    Dummy rows/columns cost nothing, so they absorb the surplus
    workers or tasks without affecting the real assignments.
    """
    cost = np.array(cost_matrix)
    rows, cols = cost.shape
    if rows == cols:
        row_ind, col_ind = linear_sum_assignment(cost)
        return tuple(row_ind), tuple(col_ind)
    # Pad to make square
    size = max(rows, cols)
    padded = np.zeros((size, size))
    padded[:rows, :cols] = cost
    row_ind, col_ind = linear_sum_assignment(padded)
    # Filter out assignments involving dummy rows/columns
    valid = [(r, c) for r, c in zip(row_ind, col_ind) if r < rows and c < cols]
    return tuple(zip(*valid))

# 3 workers, 5 tasks
cost = [
    [10, 5, 13, 4, 8],
    [3, 9, 18, 13, 6],
    [10, 7, 2, 9, 11]
]
rows, cols = solve_unbalanced(cost)
print(list(zip(rows, cols)))  # Worker-task pairs
```
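Recent SciPy versions (1.4 and later) accept rectangular matrices directly, so padding by hand is mainly useful for understanding what happens under the hood, or for solvers that require square inputs. On the same 3×5 matrix:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

cost = np.array([[10, 5, 13, 4, 8],
                 [3, 9, 18, 13, 6],
                 [10, 7, 2, 9, 11]])
row_ind, col_ind = linear_sum_assignment(cost)  # no padding needed
# Each of the 3 workers gets one task; 2 tasks stay unassigned
print(list(zip(row_ind, col_ind)), cost[row_ind, col_ind].sum())
```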
Maximization: To maximize instead of minimize, negate the matrix or subtract all values from the maximum:
```python
def maximize_assignment(profit_matrix):
    """Convert maximization to minimization."""
    profit = np.array(profit_matrix)
    cost = profit.max() - profit
    return linear_sum_assignment(cost)
```
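SciPy also exposes this directly through the `maximize` flag, which skips the manual transformation. A quick check on a small made-up profit matrix:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative profit matrix (made up for this example)
profit = np.array([[7, 5, 11],
                   [5, 4, 1],
                   [9, 3, 2]])
row_ind, col_ind = linear_sum_assignment(profit, maximize=True)
print(profit[row_ind, col_ind].sum())  # maximum total profit: 24
```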
The Hungarian Algorithm remains one of the most elegant solutions in combinatorial optimization. Understanding it gives you a powerful tool for resource allocation problems and insight into how polynomial-time algorithms can tame exponential search spaces.