Johnson's Algorithm: Sparse All-Pairs Shortest Path

Key Insights

  • Johnson’s algorithm combines Bellman-Ford and Dijkstra to solve all-pairs shortest path in O(V²logV + VE), making it dramatically faster than Floyd-Warshall for sparse graphs
  • The reweighting trick transforms negative edge weights into non-negative ones while preserving shortest path structure—this is the algorithm’s core innovation
  • Use Johnson’s when your graph is sparse (E ≪ V²) and contains negative edges; otherwise, simpler alternatives exist

Introduction & Problem Context

The all-pairs shortest path (APSP) problem asks a straightforward question: given a weighted graph, what’s the shortest path between every pair of vertices? This comes up constantly in practice—network routing tables, geographic mapping systems, dependency analysis, and financial arbitrage detection all need this information.

For dense graphs, Floyd-Warshall’s O(V³) algorithm is the standard choice. It’s simple, cache-friendly, and hard to beat when you have roughly V² edges. But real-world graphs are often sparse. Social networks, road systems, and dependency graphs typically have edges proportional to V rather than V². Running Floyd-Warshall on a sparse graph with 10,000 vertices and 50,000 edges wastes enormous computation on non-existent edges.

Johnson’s algorithm, published by Donald Johnson in 1977, exploits sparsity. It achieves O(V²logV + VE) time complexity—a massive improvement when E is small relative to V².

Why Not Floyd-Warshall or Repeated Dijkstra?

The naive approach to APSP on sparse graphs seems obvious: run Dijkstra’s algorithm from each vertex. With a binary heap, each Dijkstra run takes O((V + E)logV), giving O(V(V + E)logV) = O(V²logV + VE·logV) total. That actually matches Johnson’s complexity when Johnson’s itself uses binary heaps; the headline O(V²logV + VE) bound relies on a Fibonacci heap, which shaves the logV factor off the VE term.

More critically, Dijkstra’s algorithm fails with negative edge weights. It greedily finalizes vertices based on current shortest distances, but a negative edge later in the graph could provide a shorter path to an already-finalized vertex. The algorithm’s correctness proof breaks down entirely.

Floyd-Warshall handles negative edges fine (as long as there are no negative cycles), but its O(V³) complexity ignores graph structure. On a sparse graph with V = 10,000 and E = 50,000:

  • Floyd-Warshall: 10,000³ = 10¹² operations
  • Johnson’s: 10,000² × log(10,000) + 10,000 × 50,000 ≈ 1.3 × 10⁹ + 5 × 10⁸ ≈ 2 × 10⁹ operations

That’s a 500x speedup. Johnson’s algorithm gives you the best of both worlds: it handles negative edges and exploits sparsity.
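
The back-of-envelope comparison above can be reproduced directly. This is a rough operation count with constant factors ignored, not a benchmark:

```python
import math

V, E = 10_000, 50_000

floyd_warshall_ops = V ** 3                      # triple nested loop over vertices
johnsons_ops = V * V * math.log2(V) + V * E      # V² log V + VE

print(f"Floyd-Warshall: {floyd_warshall_ops:.1e} ops")
print(f"Johnson's:      {johnsons_ops:.1e} ops")
print(f"Speedup:        ~{floyd_warshall_ops / johnsons_ops:.0f}x")
```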

The Reweighting Trick: Core Intuition

Johnson’s key insight is elegant: transform all edge weights to be non-negative, then safely use Dijkstra. The transformation must preserve shortest paths—if path A was shorter than path B before reweighting, it must remain shorter after.

The trick uses a potential function h(v) assigned to each vertex. We reweight each edge (u, v) with original weight w(u, v) to:

w'(u, v) = w(u, v) + h(u) - h(v)

Why does this preserve shortest paths? Consider any path from s to t passing through vertices s → v₁ → v₂ → … → vₖ → t. The reweighted path length is:

w'(s, v₁) + w'(v₁, v₂) + ... + w'(vₖ, t)
= [w(s, v₁) + h(s) - h(v₁)] + [w(v₁, v₂) + h(v₁) - h(v₂)] + ... + [w(vₖ, t) + h(vₖ) - h(t)]
= w(s, v₁) + w(v₁, v₂) + ... + w(vₖ, t) + h(s) - h(t)

The intermediate h values telescope and cancel. Every path from s to t gets the same constant h(s) - h(t) added to its length. Relative ordering is preserved.
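
The telescoping argument is easy to check numerically. Here is a quick sketch on a made-up four-vertex graph with an arbitrary potential h (both the weights and the potentials are invented for illustration):

```python
# Two s→t paths in a made-up graph; h is an arbitrary potential function.
w = {('s', 'a'): 3, ('a', 't'): 4, ('s', 'b'): 1, ('b', 't'): 7}
h = {'s': 0, 'a': 5, 'b': -2, 't': 1}

# Reweight every edge: w'(u, v) = w(u, v) + h(u) - h(v)
w_prime = {(u, v): wt + h[u] - h[v] for (u, v), wt in w.items()}

def path_len(path, weights):
    return sum(weights[u, v] for u, v in zip(path, path[1:]))

for path in (['s', 'a', 't'], ['s', 'b', 't']):
    shift = path_len(path, w_prime) - path_len(path, w)
    assert shift == h['s'] - h['t']  # same constant for every s→t path
```

Both paths shift by exactly h(s) - h(t) = -1, so their relative order is unchanged.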

Now we need h values that guarantee non-negative reweighted edges. Here’s where Bellman-Ford enters: if we set h(v) equal to the shortest path distance from some source to v, the triangle inequality guarantees:

h(v) ≤ h(u) + w(u, v)

Rearranging: w(u, v) + h(u) - h(v) ≥ 0. Exactly what we need.

def reweight_edges(graph, h):
    """
    Transform edge weights using potential function h.
    Returns new adjacency list with non-negative weights.
    """
    reweighted = {v: [] for v in graph}
    
    for u in graph:
        for v, weight in graph[u]:
            new_weight = weight + h[u] - h[v]
            reweighted[u].append((v, new_weight))
    
    return reweighted

Algorithm Walkthrough

Johnson’s algorithm proceeds in five distinct phases:

  1. Add phantom source: Create a new vertex q connected to every original vertex with zero-weight edges
  2. Run Bellman-Ford: Compute shortest paths from q to all vertices, giving us h values
  3. Reweight edges: Transform all original edge weights using the potential function
  4. Run Dijkstra from each vertex: Compute shortest paths on the reweighted graph
  5. Adjust distances: Convert reweighted distances back to original weights

import heapq
from collections import defaultdict

def johnsons_algorithm(vertices, edges):
    """
    Compute all-pairs shortest paths using Johnson's algorithm.
    
    Args:
        vertices: List of vertex identifiers
        edges: List of (u, v, weight) tuples
    
    Returns:
        2D dictionary of shortest distances, or None if negative cycle exists
    """
    # Phase 1: Build graph and add phantom source 'q'
    graph = defaultdict(list)
    for u, v, w in edges:
        graph[u].append((v, w))
    
    q = '__phantom_source__'
    for v in vertices:
        graph[q].append((v, 0))
    
    all_vertices = vertices + [q]
    
    # Phase 2: Run Bellman-Ford from phantom source
    h = {v: float('inf') for v in all_vertices}
    h[q] = 0
    
    # Relax all edges |all_vertices| - 1 times (V passes, counting the phantom)
    for _ in range(len(all_vertices) - 1):
        for u in all_vertices:
            if h[u] == float('inf'):
                continue
            for v, w in graph[u]:
                if h[u] + w < h[v]:
                    h[v] = h[u] + w
    
    # Check for negative cycles (Phase 2 continued)
    for u in all_vertices:
        if h[u] == float('inf'):
            continue
        for v, w in graph[u]:
            if h[u] + w < h[v]:
                return None  # Negative cycle detected
    
    # Phase 3: Reweight all original edges
    reweighted = defaultdict(list)
    for u, v, w in edges:
        new_weight = w + h[u] - h[v]
        reweighted[u].append((v, new_weight))
    
    # Phase 4: Run Dijkstra from each original vertex
    def dijkstra(source):
        dist = {v: float('inf') for v in vertices}
        dist[source] = 0
        pq = [(0, source)]
        
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist[u]:
                continue
            for v, w in reweighted[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
                    heapq.heappush(pq, (dist[v], v))
        
        return dist
    
    reweighted_distances = {}
    for source in vertices:
        reweighted_distances[source] = dijkstra(source)
    
    # Phase 5: Adjust distances back to original weights
    result = defaultdict(dict)
    for u in vertices:
        for v in vertices:
            if reweighted_distances[u][v] == float('inf'):
                result[u][v] = float('inf')
            else:
                # Reverse the reweighting transformation
                result[u][v] = reweighted_distances[u][v] - h[u] + h[v]
    
    return result
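
To see the five phases interact, here is a hand-worked trace on a made-up three-vertex graph with one negative edge. The h values are the Bellman-Ford distances from the phantom source, computed by hand rather than by calling the function above:

```python
# Made-up graph: A→B costs 1, B→C costs -2, A→C costs 2.
vertices = ['A', 'B', 'C']
edges = [('A', 'B', 1), ('B', 'C', -2), ('A', 'C', 2)]

# Phase 2 (by hand): Bellman-Ford from the phantom source q, which has a
# 0-weight edge to every vertex, yields these shortest distances as h.
h = {'A': 0, 'B': 0, 'C': -2}

# Phase 3: after reweighting, every edge weight is non-negative.
reweighted = {(u, v): w + h[u] - h[v] for u, v, w in edges}
assert all(w >= 0 for w in reweighted.values())

# Phase 4 (by hand): on the reweighted graph, Dijkstra from A finds
# A → B → C with cost 1 + 0 = 1, beating the direct A → C edge (cost 4).
dist_prime = reweighted['A', 'B'] + reweighted['B', 'C']

# Phase 5: undo the shift to recover the true distance A → C.
dist = dist_prime - h['A'] + h['C']
print(dist)  # -1, matching the original path 1 + (-2)
```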

Handling Negative Cycles

A negative cycle makes shortest paths undefined—you can always go around the cycle one more time to get a shorter path. Johnson’s algorithm detects this during the Bellman-Ford phase.

Bellman-Ford relaxes all edges V-1 times, which is sufficient for shortest paths in a graph without negative cycles (any shortest path has at most V-1 edges). If a V-th relaxation pass still finds improvements, a negative cycle must exist.

def detect_negative_cycle(vertices, edges):
    """
    Check if graph contains a negative cycle.
    Returns True if negative cycle exists.
    """
    # Add phantom source connected to all vertices
    dist = {v: float('inf') for v in vertices}
    phantom = '__source__'
    dist[phantom] = 0
    
    all_edges = edges + [(phantom, v, 0) for v in vertices]
    all_vertices = vertices + [phantom]
    
    # Relax V-1 times
    for _ in range(len(all_vertices) - 1):
        for u, v, w in all_edges:
            if dist[u] != float('inf') and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    
    # Check for further improvements (indicates negative cycle)
    for u, v, w in all_edges:
        if dist[u] != float('inf') and dist[u] + w < dist[v]:
            return True
    
    return False

When your algorithm detects a negative cycle, return an appropriate sentinel value or raise an exception. Don’t silently return incorrect results.
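
A minimal sketch of the V-th-pass check, using an edge-list Bellman-Ford on a hypothetical three-vertex cycle whose weights sum to -2:

```python
# Cycle A → B → C → A with total weight 1 - 2 - 1 = -2 (negative).
edges = [('A', 'B', 1), ('B', 'C', -2), ('C', 'A', -1)]
dist = {'A': 0, 'B': float('inf'), 'C': float('inf')}

# Standard V-1 relaxation passes over all edges
for _ in range(len(dist) - 1):
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w

# The V-th pass: any remaining improvement proves a negative cycle
has_negative_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
print(has_negative_cycle)  # True
```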

Complexity Analysis & When to Use It

Breaking down Johnson’s time complexity by phase:

  • Bellman-Ford: O(VE) — relaxes E edges, V times
  • Reweighting: O(E) — single pass over all edges
  • V Dijkstra runs: O(V(V + E)logV) with binary heap, simplifies to O(V²logV + VE·logV)

The total is O(VE + V²logV + VE·logV). For sparse graphs where E = O(V), this becomes O(V²logV). Using a Fibonacci heap for Dijkstra reduces the V Dijkstra runs to O(V²logV + VE), giving overall O(VE + V²logV).

Space complexity is O(V²) for storing the output distance matrix, plus O(V + E) for the graph representation.

Decision guide:

  Condition                            Best Algorithm
  Dense graph (E ≈ V²), any weights    Floyd-Warshall
  Sparse graph, non-negative weights   V × Dijkstra
  Sparse graph, negative weights       Johnson’s
  Single-source, negative weights      Bellman-Ford

Practical Considerations

Graph representation matters. Johnson’s algorithm requires iterating over outgoing edges from each vertex. Use adjacency lists, not adjacency matrices. An adjacency matrix turns the O(VE) Bellman-Ford phase into O(V³), eliminating the sparsity advantage entirely.

Priority queue selection affects constants. Binary heaps are simple and cache-friendly. Fibonacci heaps have better theoretical complexity but higher constant factors and implementation complexity. For most practical graph sizes, binary heaps win.

Real-world applications include currency arbitrage detection (negative cycles represent profitable trading loops), network routing with link costs that can be negative (representing subsidies or incentives), and game AI pathfinding in graphs with both rewards and penalties.

# Example: Finding arbitrage opportunities in currency exchange
vertices = ['USD', 'EUR', 'GBP', 'JPY']
# Edge weights are -log(exchange_rate) to convert multiplication to addition
edges = [
    ('USD', 'EUR', -0.08),  # 1 USD = 1.08 EUR
    ('EUR', 'GBP', 0.12),   # 1 EUR = 0.89 GBP
    ('GBP', 'USD', -0.31),  # 1 GBP = 1.36 USD
    ('USD', 'JPY', -4.70),  # 1 USD = 110 JPY
    ('JPY', 'EUR', 4.62),   # 1 JPY ≈ 0.0099 EUR
]

result = johnsons_algorithm(vertices, edges)

if result is None:
    print("Arbitrage opportunity detected!")
else:
    for src in vertices:
        for dst in vertices:
            print(f"{src} -> {dst}: {result[src][dst]:.4f}")

Johnson’s algorithm is one of those algorithms that elegantly combines simpler building blocks—Bellman-Ford and Dijkstra—to solve a harder problem. Understanding why the reweighting trick works gives you insight into potential functions, an idea that appears throughout algorithm design, from network flows to amortized analysis.
