Graph Representations: Edge List, Adjacency List, Adjacency Matrix
The way you store a graph determines everything about your algorithm's performance. Choose wrong, and you'll burn through memory on sparse graphs or grind through slow lookups on dense ones. I've...
Key Insights
- Edge lists excel for edge-centric algorithms like Kruskal’s MST, but suffer O(E) lookup times that make them impractical for traversal-heavy workloads.
- Adjacency lists offer the best balance for most real-world graphs, providing O(degree) neighbor access with O(V + E) space—the default choice for sparse graphs like social networks.
- Adjacency matrices trade O(V²) space for O(1) edge lookups, making them ideal only for dense graphs or when constant-time edge queries are critical.
Why Representation Choice Matters
The way you store a graph determines everything about your algorithm’s performance. Choose wrong, and you’ll burn through memory on sparse graphs or grind through slow lookups on dense ones. I’ve seen production systems crawl because someone defaulted to an adjacency matrix for a social graph with millions of nodes and average degree of 50.
Graph algorithms don’t exist in isolation—they run on data structures. A BFS that’s theoretically O(V + E) becomes O(V²) if your representation forces you to scan every possible edge. Understanding these trade-offs isn’t academic; it’s the difference between code that scales and code that falls over.
We’ll examine three fundamental representations: edge lists, adjacency lists, and adjacency matrices. Each has a place, and knowing when to use which separates competent engineers from those who copy-paste from tutorials.
Edge List Representation
An edge list is exactly what it sounds like: a flat list of edges. Each edge is a tuple containing source vertex, destination vertex, and optionally a weight. Nothing more.
from dataclasses import dataclass
from typing import List, Optional, Tuple
@dataclass
class Edge:
source: int
destination: int
weight: float = 1.0
class EdgeListGraph:
def __init__(self, num_vertices: int, directed: bool = False):
self.num_vertices = num_vertices
self.directed = directed
self.edges: List[Edge] = []
def add_edge(self, source: int, dest: int, weight: float = 1.0) -> None:
self.edges.append(Edge(source, dest, weight))
# For undirected graphs, we only store once but handle in queries
def has_edge(self, source: int, dest: int) -> bool:
for edge in self.edges:
if edge.source == source and edge.destination == dest:
return True
if not self.directed:
if edge.source == dest and edge.destination == source:
return True
return False
def get_neighbors(self, vertex: int) -> List[int]:
neighbors = []
for edge in self.edges:
if edge.source == vertex:
neighbors.append(edge.destination)
elif not self.directed and edge.destination == vertex:
neighbors.append(edge.source)
return neighbors
def get_edges_sorted_by_weight(self) -> List[Edge]:
return sorted(self.edges, key=lambda e: e.weight)
Edge lists shine in one specific scenario: algorithms that process edges in sorted order. Kruskal’s minimum spanning tree algorithm needs to iterate through edges by weight—an edge list with a single sort gives you exactly that. They’re also the natural format for graph input files and streaming edge data.
The downsides are brutal. Checking if an edge exists requires scanning the entire list: O(E). Finding neighbors? Also O(E). For any traversal algorithm, you’re looking at O(V × E) just to visit every vertex and check its neighbors. Don’t use edge lists for BFS, DFS, or Dijkstra’s algorithm.
Adjacency List Representation
Adjacency lists store, for each vertex, a collection of its neighbors. Implementation typically uses a dictionary mapping vertices to lists (or sets) of adjacent vertices.
from collections import defaultdict, deque
from typing import Dict, List, Set, Optional
class AdjacencyListGraph:
def __init__(self, directed: bool = False):
self.directed = directed
self.adj: Dict[int, List[Tuple[int, float]]] = defaultdict(list)
self._vertices: Set[int] = set()
def add_vertex(self, vertex: int) -> None:
self._vertices.add(vertex)
def add_edge(self, source: int, dest: int, weight: float = 1.0) -> None:
self._vertices.add(source)
self._vertices.add(dest)
self.adj[source].append((dest, weight))
if not self.directed:
self.adj[dest].append((source, weight))
def has_edge(self, source: int, dest: int) -> bool:
return any(neighbor == dest for neighbor, _ in self.adj[source])
def get_neighbors(self, vertex: int) -> List[int]:
return [neighbor for neighbor, _ in self.adj[vertex]]
def num_vertices(self) -> int:
return len(self._vertices)
def num_edges(self) -> int:
total = sum(len(neighbors) for neighbors in self.adj.values())
return total if self.directed else total // 2
def bfs(self, start: int) -> List[int]:
visited = set()
order = []
queue = deque([start])
visited.add(start)
while queue:
vertex = queue.popleft()
order.append(vertex)
for neighbor, _ in self.adj[vertex]:
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
return order
def dfs(self, start: int) -> List[int]:
visited = set()
order = []
def dfs_recursive(vertex: int) -> None:
visited.add(vertex)
order.append(vertex)
for neighbor, _ in self.adj[vertex]:
if neighbor not in visited:
dfs_recursive(neighbor)
dfs_recursive(start)
return order
This is the workhorse representation. Space complexity is O(V + E)—you store each vertex once and each edge once (or twice for undirected graphs). Getting all neighbors of a vertex takes O(degree) time, which is optimal since you need to at least read the output.
BFS and DFS run in their theoretical O(V + E) time because iterating through all neighbors across all vertices touches each edge exactly once. Dijkstra’s algorithm, topological sort, cycle detection—all graph traversal algorithms perform optimally with adjacency lists.
The weakness is edge existence checking. To determine if edge (u, v) exists, you must scan u’s neighbor list: O(degree(u)). For high-degree vertices, this adds up. You can mitigate this by using sets instead of lists for neighbors, trading some memory for O(1) average-case lookups.
Adjacency Matrix Representation
An adjacency matrix uses a 2D array where position [i][j] indicates whether an edge exists from vertex i to vertex j. For weighted graphs, store the weight; for unweighted, use boolean or 0/1.
from typing import List, Optional
import numpy as np
class AdjacencyMatrixGraph:
def __init__(self, num_vertices: int, directed: bool = False):
self.num_vertices = num_vertices
self.directed = directed
# Using infinity for no edge, actual weight for edges
self.matrix = np.full((num_vertices, num_vertices), np.inf)
np.fill_diagonal(self.matrix, 0) # Distance to self is 0
def add_edge(self, source: int, dest: int, weight: float = 1.0) -> None:
self.matrix[source][dest] = weight
if not self.directed:
self.matrix[dest][source] = weight
def remove_edge(self, source: int, dest: int) -> None:
self.matrix[source][dest] = np.inf
if not self.directed:
self.matrix[dest][source] = np.inf
def has_edge(self, source: int, dest: int) -> bool:
return self.matrix[source][dest] != np.inf
def get_weight(self, source: int, dest: int) -> float:
return self.matrix[source][dest]
def get_neighbors(self, vertex: int) -> List[int]:
neighbors = []
for i in range(self.num_vertices):
if i != vertex and self.matrix[vertex][i] != np.inf:
neighbors.append(i)
return neighbors
def num_edges(self) -> int:
count = np.sum(self.matrix != np.inf) - self.num_vertices # Exclude diagonal
return int(count) if self.directed else int(count) // 2
The adjacency matrix gives you O(1) edge lookups and updates. Need to check if vertices 47 and 892 are connected? One array access. This makes certain algorithms faster—Floyd-Warshall for all-pairs shortest paths operates directly on the matrix with clean, cache-friendly access patterns.
The cost is space: O(V²) regardless of edge count. A graph with 100,000 vertices requires 10 billion entries. Even with 1-byte entries, that’s 10GB. If your graph is sparse—and most real-world graphs are—you’re wasting massive amounts of memory storing zeros.
Getting neighbors also degrades to O(V) since you must scan an entire row to find non-zero entries.
Comparative Analysis
| Operation | Edge List | Adjacency List | Adjacency Matrix |
|---|---|---|---|
| Space | O(E) | O(V + E) | O(V²) |
| Add Edge | O(1) | O(1) | O(1) |
| Remove Edge | O(E) | O(degree) | O(1) |
| Check Edge | O(E) | O(degree) | O(1) |
| Get Neighbors | O(E) | O(degree) | O(V) |
| Iterate All Edges | O(E) | O(V + E) | O(V²) |
Here’s a benchmark to make these differences concrete:
import time
import random
from typing import Callable
def benchmark_operation(name: str, operation: Callable, iterations: int = 1000) -> float:
start = time.perf_counter()
for _ in range(iterations):
operation()
elapsed = time.perf_counter() - start
return elapsed / iterations * 1000 # Return ms per operation
def compare_representations(num_vertices: int, num_edges: int):
print(f"\nBenchmark: {num_vertices} vertices, {num_edges} edges")
print(f"Density: {num_edges / (num_vertices * (num_vertices - 1)):.4f}")
print("-" * 60)
# Build all three representations
edges = [(random.randint(0, num_vertices-1),
random.randint(0, num_vertices-1))
for _ in range(num_edges)]
edge_list = EdgeListGraph(num_vertices)
adj_list = AdjacencyListGraph()
adj_matrix = AdjacencyMatrixGraph(num_vertices)
for src, dst in edges:
edge_list.add_edge(src, dst)
adj_list.add_edge(src, dst)
adj_matrix.add_edge(src, dst)
# Benchmark edge checking
test_pairs = [(random.randint(0, num_vertices-1),
random.randint(0, num_vertices-1)) for _ in range(100)]
print("Edge check (ms per 1000 ops):")
print(f" Edge List: {benchmark_operation('', lambda: edge_list.has_edge(*random.choice(test_pairs))):.4f}")
print(f" Adj List: {benchmark_operation('', lambda: adj_list.has_edge(*random.choice(test_pairs))):.4f}")
print(f" Adj Matrix: {benchmark_operation('', lambda: adj_matrix.has_edge(*random.choice(test_pairs))):.4f}")
# Run comparisons
compare_representations(1000, 5000) # Sparse
compare_representations(1000, 100000) # Dense
Choosing the Right Representation
Use this decision framework:
Choose Edge List when:
- You’re implementing Kruskal’s algorithm or need edges sorted by weight
- The graph is read once from input and processed edge-by-edge
- Memory is extremely constrained and the graph is very sparse
Choose Adjacency List when:
- You’re implementing BFS, DFS, Dijkstra, or any traversal algorithm
- The graph is sparse (E « V²)
- You need to frequently iterate over a vertex’s neighbors
- This is your default choice for most applications
Choose Adjacency Matrix when:
- The graph is dense (E approaches V²)
- You need O(1) edge existence checks as the primary operation
- You’re implementing Floyd-Warshall or matrix-based algorithms
- V is small enough that V² fits comfortably in memory
Converting between representations is straightforward:
def edge_list_to_adj_list(edge_graph: EdgeListGraph) -> AdjacencyListGraph:
adj = AdjacencyListGraph(directed=edge_graph.directed)
for edge in edge_graph.edges:
adj.add_edge(edge.source, edge.destination, edge.weight)
return adj
def adj_list_to_matrix(adj_graph: AdjacencyListGraph,
num_vertices: int) -> AdjacencyMatrixGraph:
matrix = AdjacencyMatrixGraph(num_vertices, directed=adj_graph.directed)
for vertex in range(num_vertices):
for neighbor, weight in adj_graph.adj[vertex]:
matrix.matrix[vertex][neighbor] = weight
return matrix
def matrix_to_edge_list(matrix_graph: AdjacencyMatrixGraph) -> EdgeListGraph:
edge = EdgeListGraph(matrix_graph.num_vertices,
directed=matrix_graph.directed)
for i in range(matrix_graph.num_vertices):
start = 0 if matrix_graph.directed else i + 1
for j in range(start, matrix_graph.num_vertices):
if matrix_graph.has_edge(i, j):
edge.add_edge(i, j, matrix_graph.get_weight(i, j))
return edge
Conclusion
Graph representation isn’t a one-size-fits-all decision. Edge lists work for edge-centric batch processing. Adjacency lists handle the vast majority of graph algorithms efficiently. Adjacency matrices pay their space cost only when density or constant-time lookups justify it.
Quick Reference:
- Sparse graph + traversal → Adjacency List
- Dense graph + edge queries → Adjacency Matrix
- Edge sorting + batch processing → Edge List
- Unsure → Adjacency List (it’s rarely wrong)
Start with adjacency lists. Profile your actual workload. Switch representations only when you have evidence that another choice performs better for your specific access patterns.