Skip List: Probabilistic Data Structure Implementation

Key Insights

  • Skip lists achieve O(log n) average-case operations through probabilistic balancing, eliminating the complex rotation logic required by AVL or Red-Black trees
  • The “coin flip” approach to level assignment creates a naturally balanced structure without any rebalancing operations, making skip lists excellent for concurrent implementations
  • Redis uses skip lists for sorted sets specifically because they outperform balanced trees when you need both ranked access and score-based lookups

Introduction to Skip Lists

Skip lists solve a fundamental problem: how do you get O(log n) search performance from a linked list? Regular linked lists require O(n) traversal, but skip lists add “express lanes” that let you skip over large portions of the data.

Invented by William Pugh in 1989, skip lists provide the same asymptotic performance as balanced binary search trees but with dramatically simpler implementation. Where a Red-Black tree requires careful color management and rotations, a skip list just flips coins.

This simplicity matters in production. Redis chose skip lists over balanced trees for their sorted set implementation. LevelDB and RocksDB use skip lists for their in-memory tables. The reason is practical: skip lists are easier to reason about, easier to debug, and naturally support concurrent access patterns.

How Skip Lists Work

Imagine a sorted linked list. To find an element, you traverse from the head, comparing each node until you find your target or overshoot. This is O(n).

Now imagine adding a second layer of links that skip every other node. You can traverse this express lane, dropping down to the regular lane only when needed. This roughly halves your comparisons.

Extend this concept: add more layers, each skipping more nodes than the one below. The top layer might have only a few nodes, letting you jump across huge portions of the list. When you can’t go further right without overshooting, drop down a level and continue.

The key insight is that you don’t need to carefully maintain which nodes appear at which levels. Instead, when inserting a node, you randomly decide its height by flipping a coin. Heads? Add another level. Keep flipping until you get tails. This geometric distribution naturally creates the right structure: about half the nodes at level 1, a quarter at level 2, an eighth at level 3, and so on.

Level 3:  HEAD ─────────────────────────────────────> 50 ──────────────> NIL
Level 2:  HEAD ──────────────> 25 ──────────────────> 50 ──────────────> NIL
Level 1:  HEAD ────> 10 ────> 25 ────> 30 ────> 42 ─> 50 ────> 67 ────> NIL
Level 0:  HEAD -> 5 -> 10 -> 25 -> 30 -> 42 -> 50 -> 55 -> 67 -> 80 -> NIL
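The geometric drop-off is easy to check empirically. This standalone sketch (independent of the classes defined later) samples the coin-flip procedure 100,000 times and tallies how often each level comes up:

```python
import random
from collections import Counter

def random_level(p: float = 0.5, max_level: int = 16) -> int:
    """Coin-flip level assignment: keep promoting while the flip succeeds."""
    level = 0
    while random.random() < p and level < max_level:
        level += 1
    return level

random.seed(42)  # fixed seed so the tallies are reproducible
N = 100_000
counts = Counter(random_level() for _ in range(N))

for lvl in range(4):
    print(f"level {lvl}: {counts[lvl] / N:.3f}")
```

The printed fractions land near 0.5, 0.25, 0.125, 0.0625 — exactly the halving pattern the diagram above shows.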

Core Data Structures

The implementation requires two classes: a node that holds a value and forward pointers, and the skip list itself that manages the structure.

import random
from typing import Optional, List

class SkipListNode:
    def __init__(self, value: int, level: int):
        self.value = value
        # forward[i] points to the next node at level i
        self.forward: List[Optional['SkipListNode']] = [None] * (level + 1)
    
    def __repr__(self):
        return f"Node({self.value}, levels={len(self.forward)})"


class SkipList:
    def __init__(self, max_level: int = 16, probability: float = 0.5):
        self.max_level = max_level
        self.probability = probability
        self.level = 0  # current highest level in use
        # head is a sentinel node with the max possible levels; its value is
        # never compared during traversal, so -1 is just a placeholder
        self.head = SkipListNode(-1, max_level)
    
    def random_level(self) -> int:
        """Generate a random level using geometric distribution."""
        level = 0
        while random.random() < self.probability and level < self.max_level:
            level += 1
        return level

The max_level parameter caps the height of the structure. For n elements, you want max_level ≈ log₂(n). Setting it to 16 handles up to 65,536 elements efficiently; 32 handles billions.

The probability parameter (typically 0.5) controls the density of upper levels. Lower values create sparser upper levels, trading slightly worse search time for less memory.
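Choosing max_level can be reduced to a one-liner. The helper below is an illustrative rule of thumb (not part of the article's class): round log₂ of the expected element count up to the nearest integer.

```python
import math

def suggested_max_level(expected_n: int) -> int:
    """Rule of thumb: max_level ≈ ceil(log2(n)) for n expected elements."""
    return max(1, math.ceil(math.log2(expected_n)))

print(suggested_max_level(65_536))     # 16
print(suggested_max_level(1_000_000))  # 20
```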

Search Operation

Searching follows a simple pattern: start at the top level of the head node, move right while the next node’s value is less than your target, then drop down a level. Repeat until you find the value or reach the bottom.

def search(self, target: int) -> bool:
    """Search for a value in the skip list. Returns True if found."""
    current = self.head
    
    # Start from the highest level and work down
    for level in range(self.level, -1, -1):
        # Move right while next node exists and has smaller value
        while (current.forward[level] is not None and 
               current.forward[level].value < target):
            current = current.forward[level]
    
    # Move to the candidate node at level 0
    current = current.forward[0]
    
    return current is not None and current.value == target

The beauty of this algorithm is its simplicity. Each level roughly halves the remaining search space, giving O(log n) expected time. The worst case is O(n)—for instance, if every node happens to land at level 0—but the probability of a badly unbalanced structure falls off exponentially with n.
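As a back-of-the-envelope check, Pugh's analysis puts the expected search cost at roughly log₁/ₚ(n) / p comparisons (this helper is just that formula, written out for illustration):

```python
import math

def expected_comparisons(n: int, p: float = 0.5) -> float:
    # Rough expected search cost from Pugh's analysis: log_{1/p}(n) / p
    return math.log(n, 1 / p) / p

# ~32 comparisons for 65,536 elements, versus 65,536 for a linear scan
print(round(expected_comparisons(65_536)))
```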

Insertion with Randomized Levels

Insertion requires finding where the new node belongs and updating forward pointers at each level. We track the “update” array—the rightmost node at each level that will need its forward pointer changed.

def insert(self, value: int) -> None:
    """Insert a value into the skip list."""
    # Track nodes that need pointer updates at each level
    update = [None] * (self.max_level + 1)
    current = self.head
    
    # Find insertion position, tracking update nodes
    for level in range(self.level, -1, -1):
        while (current.forward[level] is not None and 
               current.forward[level].value < value):
            current = current.forward[level]
        update[level] = current
    
    # current.forward[0] now points at the first node with value >= `value`.
    # We don't check it for equality: this implementation allows duplicates.
    # (To forbid duplicates, compare and return or update here instead.)
    
    # Generate random level for new node
    new_level = self.random_level()
    
    # If new level exceeds current max, update tracking
    if new_level > self.level:
        for level in range(self.level + 1, new_level + 1):
            update[level] = self.head
        self.level = new_level
    
    # Create new node and rewire pointers
    new_node = SkipListNode(value, new_level)
    for level in range(new_level + 1):
        new_node.forward[level] = update[level].forward[level]
        update[level].forward[level] = new_node

The random_level() function is the probabilistic heart of skip lists. Each call simulates coin flips, and the geometric distribution naturally creates the multi-level structure that enables fast search.

Deletion Operation

Deletion mirrors insertion: find the node, then rewire pointers at each level where the node appears.

def delete(self, value: int) -> bool:
    """Delete a value from the skip list. Returns True if found and deleted."""
    update = [None] * (self.max_level + 1)
    current = self.head
    
    # Find the node and track update pointers
    for level in range(self.level, -1, -1):
        while (current.forward[level] is not None and 
               current.forward[level].value < value):
            current = current.forward[level]
        update[level] = current
    
    current = current.forward[0]
    
    # If node doesn't exist, nothing to delete
    if current is None or current.value != value:
        return False
    
    # Rewire pointers at each level
    for level in range(self.level + 1):
        if update[level].forward[level] != current:
            break
        update[level].forward[level] = current.forward[level]
    
    # Reduce level if top levels are now empty
    while self.level > 0 and self.head.forward[self.level] is None:
        self.level -= 1
    
    return True

The final loop reduces the skip list’s level when deletions empty the upper layers. This keeps the structure efficient as elements are removed.
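Putting the pieces together, here is a condensed, lightly refactored version of the article's code (the shared walk-down loop is factored into a `_find_update` helper, an organizational choice, not a behavioral change) so the three operations can be exercised end to end:

```python
import random
from typing import List, Optional

class SkipListNode:
    def __init__(self, value: int, level: int):
        self.value = value
        self.forward: List[Optional["SkipListNode"]] = [None] * (level + 1)

class SkipList:
    def __init__(self, max_level: int = 16, probability: float = 0.5):
        self.max_level = max_level
        self.probability = probability
        self.level = 0
        self.head = SkipListNode(-1, max_level)  # sentinel; value never compared

    def random_level(self) -> int:
        level = 0
        while random.random() < self.probability and level < self.max_level:
            level += 1
        return level

    def _find_update(self, value: int) -> List[SkipListNode]:
        """Record the rightmost node before `value` at every level."""
        update = [self.head] * (self.max_level + 1)
        current = self.head
        for level in range(self.level, -1, -1):
            while (current.forward[level] is not None
                   and current.forward[level].value < value):
                current = current.forward[level]
            update[level] = current
        return update

    def insert(self, value: int) -> None:
        update = self._find_update(value)
        new_level = self.random_level()
        self.level = max(self.level, new_level)
        node = SkipListNode(value, new_level)
        for level in range(new_level + 1):
            node.forward[level] = update[level].forward[level]
            update[level].forward[level] = node

    def search(self, target: int) -> bool:
        candidate = self._find_update(target)[0].forward[0]
        return candidate is not None and candidate.value == target

    def delete(self, value: int) -> bool:
        update = self._find_update(value)
        node = update[0].forward[0]
        if node is None or node.value != value:
            return False
        for level in range(self.level + 1):
            if update[level].forward[level] is not node:
                break
            update[level].forward[level] = node.forward[level]
        while self.level > 0 and self.head.forward[self.level] is None:
            self.level -= 1
        return True

random.seed(0)
sl = SkipList()
for v in [30, 10, 50, 20, 40]:
    sl.insert(v)
print(sl.search(30), sl.search(99))  # True False
sl.delete(30)
print(sl.search(30))  # False
```

Note that correctness never depends on which levels the coin flips produce; only the speed does.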

Performance Analysis and Practical Considerations

Skip lists provide O(log n) expected time for search, insert, and delete operations. Space complexity is O(n) for the nodes themselves, plus O(n) expected for the forward pointers (each node has 2 pointers on average with p=0.5).

import time
from typing import Callable

def benchmark(name: str, operation: Callable, iterations: int = 10000):
    """Simple benchmark utility."""
    start = time.perf_counter()
    for i in range(iterations):
        operation(i)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.4f}s for {iterations} operations")

# Benchmark skip list
skip_list = SkipList()
benchmark("Skip List Insert", skip_list.insert)
benchmark("Skip List Search", skip_list.search)

# Compare with sorted list using bisect
import bisect
sorted_list = []
benchmark("Sorted List Insert", lambda x: bisect.insort(sorted_list, x))
benchmark("Sorted List Search", lambda x: bisect.bisect_left(sorted_list, x))

When should you choose skip lists over balanced trees?

Concurrency: Skip lists shine in concurrent scenarios. Lock-free skip list implementations exist and are simpler than lock-free tree implementations. You can lock individual nodes rather than subtrees.

Range queries: Finding all elements between two values is natural—just find the start and traverse forward at level 0.

Simplicity: If you need to implement a sorted data structure from scratch, skip lists require less code and fewer edge cases than Red-Black trees.

Memory locality: Skip lists can have worse cache performance than trees because nodes aren’t necessarily contiguous. For cache-sensitive applications, consider this tradeoff.

The probability parameter offers a tuning knob. With p=0.5, you get the classic analysis. Lower values (p=0.25) create fewer levels, saving memory at the cost of slightly longer searches. Higher values create more levels, speeding searches but using more memory.
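The memory side of that tradeoff is concrete: a node at level L stores L+1 forward pointers, so the expected pointer count per node is 1/(1−p). This simulation (a standalone sketch reusing the coin-flip procedure) confirms the figure for two values of p:

```python
import random

def average_pointers(p: float, samples: int = 100_000, max_level: int = 32) -> float:
    """Estimate forward pointers per node; expected value is 1 / (1 - p)."""
    random.seed(1)  # fixed seed for reproducibility
    total = 0
    for _ in range(samples):
        level = 0
        while random.random() < p and level < max_level:
            level += 1
        total += level + 1  # a node at level L carries L + 1 pointers
    return total / samples

print(f"p=0.50: {average_pointers(0.50):.2f} pointers/node")  # ≈ 2.00
print(f"p=0.25: {average_pointers(0.25):.2f} pointers/node")  # ≈ 1.33
```

Dropping p from 0.5 to 0.25 cuts the pointer overhead by about a third, which is exactly the memory saving the paragraph above describes.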

Skip lists prove that sometimes the probabilistic approach beats the deterministic one—not in worst-case guarantees, but in implementation simplicity and practical performance. When Redis needed a sorted set implementation, they didn’t reach for a Red-Black tree. They reached for a skip list.
