Edmonds-Karp Algorithm: BFS-Based Max Flow

Key Insights

  • Edmonds-Karp guarantees O(VE²) time complexity by using BFS to find the shortest augmenting path, eliminating the pathological cases that plague DFS-based Ford-Fulkerson
  • The residual graph is the core data structure—track both forward capacity and backward flow to enable flow “undoing” when better paths emerge
  • Most max-flow interview problems and practical applications (bipartite matching, network routing, image segmentation) reduce cleanly to Edmonds-Karp with minimal modification

Introduction to Maximum Flow Problems

Flow networks model systems where something moves from a source to a sink through a network of edges with capacity constraints. Think of water pipes, network packets, or goods through a supply chain. Each edge has a maximum capacity, and we want to push as much “flow” as possible from source to sink without exceeding any edge’s limit.

The formal setup: a directed graph G = (V, E) with a source node s, sink node t, and capacity function c(u, v) for each edge. We seek a flow function f(u, v) that maximizes total flow while respecting three constraints: capacity limits (flow ≤ capacity), flow conservation (inflow equals outflow at intermediate nodes), and skew symmetry (f(u, v) = -f(v, u)).
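
The constraints can be checked mechanically. Here is a minimal sketch of a validator (the toy network and the helper name are illustrative, not part of any library); skew symmetry is implicit when flow is stored only on directed edges:

```python
from collections import defaultdict

def is_valid_flow(flow, capacity, source, sink):
    """Check capacity limits and conservation for a flow given as {(u, v): f}."""
    # Capacity limits: f(u, v) <= c(u, v) on every edge
    for (u, v), f in flow.items():
        if f > capacity.get((u, v), 0):
            return False
    # Conservation: net flow is zero at every intermediate node
    net = defaultdict(int)
    for (u, v), f in flow.items():
        net[u] -= f
        net[v] += f
    return all(net[x] == 0 for x in net if x not in (source, sink))

# Toy network: 0 -> 1 -> 2, plus a direct edge 0 -> 2
capacity = {(0, 1): 3, (1, 2): 3, (0, 2): 2}
flow = {(0, 1): 3, (1, 2): 3, (0, 2): 2}   # pushes 5 units in total
print(is_valid_flow(flow, capacity, 0, 2))  # True
```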

Real-world applications are everywhere. Network administrators use max flow to determine bandwidth bottlenecks. Logistics companies optimize shipping routes. Dating apps solve bipartite matching (users to potential matches) as a max-flow problem. Image segmentation algorithms use min-cut/max-flow duality to separate foreground from background.

From Ford-Fulkerson to Edmonds-Karp

Ford-Fulkerson is the foundational max-flow algorithm: repeatedly find any path from source to sink with available capacity, push flow along it, and update the residual graph. The problem? “Any path” is dangerously vague.

With DFS-based path finding, Ford-Fulkerson can exhibit terrible performance. Consider a graph where two parallel paths share a cross-edge with capacity 1, while the main paths have capacity 1,000,000. DFS might alternate between paths that include the cross-edge, incrementing flow by 1 each iteration—requiring 2,000,000 iterations instead of 2.
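
You can see BFS sidestep this on the pathological network itself. A standalone sketch (node labels and the helper are illustrative): BFS returns a length-2 path that skips the capacity-1 cross-edge entirely, so no iteration is wasted pushing 1 unit at a time:

```python
from collections import defaultdict, deque

def bfs_shortest_path(capacity, graph, s, t):
    """Return the shortest s->t path with positive residual capacity, or None."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in parent and capacity[u][v] > 0:
                parent[v] = u
                if v == t:
                    path = [t]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(v)
    return None

# Pathological network: s=0, a=1, b=2, t=3.
# Heavy edges of capacity 1,000,000; cross-edge (a, b) of capacity 1.
C = 1_000_000
capacity = defaultdict(lambda: defaultdict(int))
graph = defaultdict(list)
for u, v, cap in [(0, 1, C), (0, 2, C), (1, 2, 1), (1, 3, C), (2, 3, C)]:
    graph[u].append(v)
    graph[v].append(u)  # reverse edge for the residual graph
    capacity[u][v] += cap

print(bfs_shortest_path(capacity, graph, 0, 3))  # [0, 1, 3] -- cross-edge skipped
```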

Edmonds-Karp fixes this by mandating BFS for path finding. By always selecting the shortest augmenting path (fewest edges), we guarantee O(VE²) complexity regardless of capacity values. Each edge becomes saturated at most O(V) times, and we have O(E) edges, with each BFS taking O(E) time.

Here’s our flow network representation:

from collections import defaultdict, deque
from typing import Dict, List, Tuple, Optional

class FlowNetwork:
    def __init__(self, vertices: int):
        self.V = vertices
        # Adjacency list: graph[u] contains all neighbors v
        self.graph: Dict[int, List[int]] = defaultdict(list)
        # Capacity matrix: capacity[u][v] = remaining capacity
        self.capacity: Dict[int, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    
    def add_edge(self, u: int, v: int, cap: int) -> None:
        """Add directed edge from u to v with given capacity."""
        self.graph[u].append(v)
        self.graph[v].append(u)  # Reverse edge for residual graph
        self.capacity[u][v] += cap  # Handle parallel edges
        # Note: capacity[v][u] starts at 0 (reverse edge)

We use an adjacency list for graph traversal combined with a capacity dictionary for O(1) lookups. The key insight: we add reverse edges immediately. These start with zero capacity but will hold “flow credit” that lets us undo previous decisions.

Core Algorithm Walkthrough

Edmonds-Karp repeats three steps until no augmenting path exists:

  1. BFS from source to sink: Find the shortest path where every edge has positive residual capacity
  2. Calculate bottleneck: The minimum capacity along the path determines how much flow we can push
  3. Update residual graph: Decrease forward edge capacities, increase reverse edge capacities

The reverse edge update is crucial. If we push 5 units along edge (u, v), we reduce capacity[u][v] by 5 and increase capacity[v][u] by 5. This doesn’t mean flow actually goes backward—it means future paths can “cancel” this flow if a better routing exists.
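
The bookkeeping can be shown in isolation. Plain nested dicts here stand in for the capacity structure above; the numbers are arbitrary:

```python
from collections import defaultdict

capacity = defaultdict(lambda: defaultdict(int))
capacity[0][1] = 10          # forward edge (0, 1) with capacity 10
# capacity[1][0] is implicitly 0 (reverse edge)

# Push 5 units along (0, 1)
flow = 5
capacity[0][1] -= flow       # forward residual: 5 left
capacity[1][0] += flow       # reverse residual: 5 units can be "undone"
print(capacity[0][1], capacity[1][0])  # 5 5

# A later path may cancel 2 of those units by traversing (1, 0)
cancel = 2
capacity[1][0] -= cancel
capacity[0][1] += cancel
print(capacity[0][1], capacity[1][0])  # 7 3
```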

# bfs_find_path is defined as a method on FlowNetwork
def bfs_find_path(self, source: int, sink: int) -> Optional[Dict[int, int]]:
    """
    Find shortest augmenting path using BFS.
    Returns parent pointer dict if path exists, None otherwise.
    """
    parent: Dict[int, int] = {source: -1}
    visited = {source}
    queue = deque([source])
    
    while queue:
        u = queue.popleft()
        
        for v in self.graph[u]:
            # Only traverse edges with remaining capacity
            if v not in visited and self.capacity[u][v] > 0:
                visited.add(v)
                parent[v] = u
                
                if v == sink:
                    return parent  # Early termination
                
                queue.append(v)
    
    return None  # No path to sink

The BFS returns parent pointers rather than the path itself. This lets us reconstruct the path while simultaneously finding the bottleneck capacity in a single backward pass.
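
That backward pass looks like this in isolation (the parent pointers and capacities are hand-built for a hypothetical path 0 → 1 → 3 → 4):

```python
# Hand-built parent pointers for the path 0 -> 1 -> 3 -> 4
parent = {0: -1, 1: 0, 3: 1, 4: 3}
capacity = {0: {1: 10}, 1: {3: 4}, 3: {4: 7}}

# Single backward pass: walk sink -> source, tracking the bottleneck
source, sink = 0, 4
bottleneck = float('inf')
path = [sink]
v = sink
while v != source:
    u = parent[v]
    bottleneck = min(bottleneck, capacity[u][v])
    path.append(u)
    v = u
path.reverse()
print(path, bottleneck)  # [0, 1, 3, 4] 4
```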

Implementation Deep Dive

Here’s the complete Edmonds-Karp implementation with detailed comments:

class EdmondsKarp(FlowNetwork):
    def max_flow(self, source: int, sink: int) -> int:
        """
        Compute maximum flow from source to sink.
        Returns total flow value.
        """
        total_flow = 0
        
        while True:
            # Step 1: Find shortest augmenting path via BFS
            parent = self.bfs_find_path(source, sink)
            
            if parent is None:
                break  # No more augmenting paths
            
            # Step 2: Find bottleneck capacity along the path
            path_flow = float('inf')
            v = sink
            while v != source:
                u = parent[v]
                path_flow = min(path_flow, self.capacity[u][v])
                v = u
            
            # Step 3: Update residual capacities along the path
            v = sink
            while v != source:
                u = parent[v]
                self.capacity[u][v] -= path_flow  # Reduce forward capacity
                self.capacity[v][u] += path_flow  # Increase reverse capacity
                v = u
            
            total_flow += path_flow
        
        return total_flow
    
    def get_flow_on_edge(self, u: int, v: int, original_cap: int) -> int:
        """Calculate actual flow on original edge (u, v)."""
        # Flow = original capacity - remaining capacity
        return original_cap - self.capacity[u][v]

Edge cases to handle: parallel edges (sum their capacities), self-loops (ignore them), disconnected sinks (return 0), and, in fixed-width languages, integer overflow with large capacities (Python's integers are arbitrary precision, so this implementation is safe).
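
One way to fold the first two checks into add_edge is sketched below. This is a variant class for illustration, not the canonical version above:

```python
from collections import defaultdict
from typing import Dict, List

class SafeFlowNetwork:
    """FlowNetwork variant with self-loop and parallel-edge handling."""
    def __init__(self, vertices: int):
        self.V = vertices
        self.graph: Dict[int, List[int]] = defaultdict(list)
        self.capacity = defaultdict(lambda: defaultdict(int))

    def add_edge(self, u: int, v: int, cap: int) -> None:
        if u == v:
            return  # self-loops never carry useful flow: ignore them
        if v not in self.graph[u]:  # avoid duplicate adjacency entries
            self.graph[u].append(v)
            self.graph[v].append(u)
        self.capacity[u][v] += cap  # parallel edges: capacities sum

net = SafeFlowNetwork(3)
net.add_edge(0, 0, 5)   # ignored
net.add_edge(0, 1, 3)
net.add_edge(0, 1, 4)   # merged with the previous edge
print(net.capacity[0][1])  # 7
```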

Worked Example with Visualization

Let’s trace through a concrete example. Consider this network:

    Source (0)
      /    \
    10      10
    /        \
  (1)        (2)
    \   2   /
    10  →  10
      \    /
       (3)
        |
       20
        |
    Sink (4)

def trace_edmonds_karp():
    """Demonstrate algorithm with step-by-step output."""
    ek = EdmondsKarp(5)
    
    # Build the network
    edges = [(0, 1, 10), (0, 2, 10), (1, 2, 2), 
             (1, 3, 10), (2, 3, 10), (3, 4, 20)]
    for u, v, cap in edges:
        ek.add_edge(u, v, cap)
    
    iteration = 0
    total = 0
    
    while True:
        parent = ek.bfs_find_path(0, 4)
        if parent is None:
            break
        
        # Reconstruct path
        path = []
        v = 4
        while v != 0:
            path.append(v)
            v = parent[v]
        path.append(0)
        path.reverse()
        
        # Find bottleneck
        flow = float('inf')
        for i in range(len(path) - 1):
            flow = min(flow, ek.capacity[path[i]][path[i+1]])
        
        iteration += 1
        print(f"Iteration {iteration}: Path {' → '.join(map(str, path))}, Flow: {flow}")
        
        # Update (already done by max_flow, but showing manually)
        for i in range(len(path) - 1):
            ek.capacity[path[i]][path[i+1]] -= flow
            ek.capacity[path[i+1]][path[i]] += flow
        
        total += flow
    
    print(f"\nMaximum Flow: {total}")

# Output:
# Iteration 1: Path 0 → 1 → 3 → 4, Flow: 10
# Iteration 2: Path 0 → 2 → 3 → 4, Flow: 10
# Maximum Flow: 20

Notice BFS finds the shortest paths first. After iteration 1, edge (1,3) is saturated. The cross-edge (1,2) is never used because direct paths suffice.

Complexity Analysis and Optimizations

Proof sketch for O(VE²): Each BFS takes O(E) time. The key insight is that the shortest path length from source to any vertex never decreases as the algorithm progresses. Each edge can become critical (bottleneck) at most O(V) times before the path to it must grow longer. With O(E) edges and O(V) critical occurrences each, we have O(VE) augmenting paths, giving O(VE²) total.
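
The O(VE) bound on augmenting paths can be checked empirically. Below is a compact, self-contained re-implementation used only to count augmentations; the example network is a fixed 6-node instance chosen for illustration (its max flow is 19 by the min cut {0, 2}... cut edges 0→1 and 2→4):

```python
from collections import defaultdict, deque

def edmonds_karp_count(edges, s, t):
    """Return (max_flow, augmentations) for a network of (u, v, cap) triples."""
    graph = defaultdict(list)
    capacity = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        graph[u].append(v)
        graph[v].append(u)  # reverse edge for the residual graph
        capacity[u][v] += cap

    total, augmentations = 0, 0
    while True:
        # BFS for the shortest augmenting path
        parent = {s: -1}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in graph[u]:
                if v not in parent and capacity[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        # Bottleneck, then residual update
        flow, v = float('inf'), t
        while v != s:
            flow = min(flow, capacity[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            capacity[u][v] -= flow
            capacity[v][u] += flow
            v = u
        total += flow
        augmentations += 1
    return total, augmentations

edges = [(0, 1, 10), (0, 2, 10), (1, 2, 2), (1, 3, 4),
         (1, 4, 8), (2, 4, 9), (3, 5, 10), (4, 3, 6), (4, 5, 10)]
flow, augs = edmonds_karp_count(edges, 0, 5)
print(flow, augs)  # max flow for this network is 19
assert augs <= 6 * len(edges)  # far below the O(VE) augmentation bound
```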

When to use Edmonds-Karp:

  • Dense graphs where E ≈ V²: Edmonds-Karp’s O(VE²) = O(V⁵) is acceptable for small V
  • Simple implementation needs: It’s the easiest max-flow algorithm to code correctly
  • Interview settings: Expected solution for most max-flow problems

When to use alternatives:

  • Dinic’s algorithm: O(V²E) in general and usually much faster in practice; on unit-capacity graphs it runs in O(E√V)
  • Push-relabel: Often faster in practice for very large graphs
  • Hopcroft-Karp: Purpose-built for bipartite matching, O(E√V)

Practical Applications

Bipartite matching reduces elegantly to max flow. Given workers and jobs with compatibility edges, add a super-source connected to all workers and a super-sink connected to all jobs. Each edge has capacity 1. Maximum flow equals maximum matching.

def bipartite_matching(workers: int, jobs: int, 
                       compatible: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """
    Find maximum bipartite matching using Edmonds-Karp.
    compatible: list of (worker_id, job_id) pairs indicating compatibility
    Returns: list of (worker, job) matched pairs
    """
    # Node layout: 0 = source, 1..workers = workers, 
    #              workers+1..workers+jobs = jobs, workers+jobs+1 = sink
    source = 0
    sink = workers + jobs + 1
    
    ek = EdmondsKarp(sink + 1)
    
    # Connect source to all workers
    for w in range(1, workers + 1):
        ek.add_edge(source, w, 1)
    
    # Connect all jobs to sink
    for j in range(workers + 1, workers + jobs + 1):
        ek.add_edge(j, sink, 1)
    
    # Add compatibility edges (capacity 1)
    for worker, job in compatible:
        worker_node = worker + 1
        job_node = workers + job + 1
        ek.add_edge(worker_node, job_node, 1)
    
    max_matching = ek.max_flow(source, sink)
    
    # Reconstruct matching from residual graph
    matching = []
    for worker, job in compatible:
        w_node = worker + 1
        j_node = workers + job + 1
        # Edge is used if capacity dropped to 0
        if ek.capacity[w_node][j_node] == 0:
            matching.append((worker, job))
    
    return matching

# Example usage
workers, jobs = 3, 3
compatible = [(0, 0), (0, 1), (1, 0), (1, 2), (2, 1), (2, 2)]
result = bipartite_matching(workers, jobs, compatible)
print(f"Maximum matching: {result}")
# Output: Maximum matching: [(0, 0), (1, 2), (2, 1)]

This pattern—reducing a combinatorial problem to max flow—appears constantly. Assignment problems, project selection, image segmentation, baseball elimination, and airline scheduling all use this technique. Master Edmonds-Karp, and you’ve unlocked a powerful problem-solving primitive.
