Cycle Sort: Minimum Write Sorting
Key Insights
- Cycle sort achieves the theoretical minimum number of writes for any in-place sorting algorithm: every misplaced element is written exactly once, so the total is n minus the number of elements already in their final positions
- The algorithm exploits the mathematical property that every permutation can be decomposed into disjoint cycles, placing each element directly into its final position
- While its O(n²) time complexity makes it unsuitable for general-purpose sorting, cycle sort becomes invaluable when write operations are expensive (flash memory, EEPROM, SSD wear leveling)
Introduction to Cycle Sort
Most sorting algorithm discussions focus on comparison counts and time complexity. We obsess over whether quicksort beats mergesort by a constant factor, while ignoring a metric that matters enormously in certain contexts: write operations.
Cycle sort, developed by B.K. Haddon in 1990, takes a different approach. It asks a simple question: what’s the minimum number of writes needed to sort an array in place? The answer is surprisingly elegant—each element needs to move at most once to its final position.
This matters more than you might think. Flash memory cells degrade after a limited number of write cycles (typically 10,000-100,000 for consumer SSDs). EEPROM in embedded systems often has even stricter limits. When you’re writing firmware for a device that needs to sort sensor readings and store them to flash, the difference between O(n log n) writes and O(n) writes translates directly to hardware lifespan.
The Core Concept: Cycle Detection
The mathematical foundation of cycle sort comes from permutation theory. Consider any unsorted array as a permutation of its sorted form. Every permutation can be decomposed into disjoint cycles.
Here’s the intuition: pick any element. It belongs somewhere specific in the sorted array. The element currently occupying that position also belongs somewhere else. Follow this chain, and you’ll eventually return to your starting element. That’s a cycle.
```python
def visualize_cycles(arr):
    """Demonstrate cycle detection in a permutation.

    Assumes distinct values; duplicates would collide in position_map.
    """
    sorted_arr = sorted(arr)
    position_map = {v: i for i, v in enumerate(sorted_arr)}
    visited = [False] * len(arr)
    cycles = []
    for start in range(len(arr)):
        if visited[start]:
            continue
        cycle = []
        current = start
        while not visited[current]:
            visited[current] = True
            cycle.append((current, arr[current]))
            # Where does this element need to go?
            current = position_map[arr[current]]
        if len(cycle) > 1:  # Single-element cycles are already in place
            cycles.append(cycle)
    return cycles

# Example: [4, 2, 1, 3] should become [1, 2, 3, 4]
arr = [4, 2, 1, 3]
cycles = visualize_cycles(arr)
for i, cycle in enumerate(cycles):
    print(f"Cycle {i + 1}: {' -> '.join(f'{v}@{p}' for p, v in cycle)}")
```
Running this reveals the structure hidden in the unsorted array. Element 4 at position 0 belongs at position 3. Element 3 at position 3 belongs at position 2. Element 1 at position 2 belongs at position 0. That’s one cycle: 4 → 3 → 1 → 4. Element 2 is already in its correct position—a trivial cycle of length 1.
Algorithm Walkthrough
The cycle sort algorithm operationalizes this insight. For each position, we find the element that belongs there and rotate through the cycle until every element in that cycle reaches its destination.
```python
def cycle_sort(arr):
    """
    Sort array in-place using cycle sort.
    Returns the number of writes performed.
    """
    writes = 0
    n = len(arr)
    # Traverse each cycle starting position
    for cycle_start in range(n - 1):
        item = arr[cycle_start]
        # Find the correct position for this item
        # by counting elements smaller than it
        pos = cycle_start
        for i in range(cycle_start + 1, n):
            if arr[i] < item:
                pos += 1
        # Item is already in the correct position
        if pos == cycle_start:
            continue
        # Skip duplicates - find the next available slot
        while item == arr[pos]:
            pos += 1
        # Place the item in its correct position
        arr[pos], item = item, arr[pos]
        writes += 1
        # Rotate the rest of the cycle
        while pos != cycle_start:
            pos = cycle_start
            # Find position for the displaced item
            for i in range(cycle_start + 1, n):
                if arr[i] < item:
                    pos += 1
            # Skip duplicates
            while item == arr[pos]:
                pos += 1
            # Place and continue
            arr[pos], item = item, arr[pos]
            writes += 1
    return writes

# Demonstration
test = [4, 2, 1, 3, 5]
writes = cycle_sort(test)
print(f"Sorted: {test}")
print(f"Total writes: {writes}")
```
The key insight is counting elements smaller than our current item to determine its sorted position. This avoids needing a separate sorted reference array. Each swap places exactly one element in its final position, never to be moved again.
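This counting idea can be checked on its own: with distinct values, an element's final index equals the number of elements smaller than it. A minimal sketch (the `rank_of` helper is hypothetical, introduced here only for illustration):

```python
def rank_of(arr, index):
    """Final sorted index of arr[index]: the count of strictly smaller elements."""
    return sum(1 for x in arr if x < arr[index])

arr = [4, 2, 1, 3]
ranks = [rank_of(arr, i) for i in range(len(arr))]
print(ranks)  # [3, 1, 0, 2] -- element 4 belongs at index 3, and so on
```

Cycle sort only counts from `cycle_start + 1` onward because everything before `cycle_start` is already in its final position and cannot affect the rank.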
Complexity Analysis
Let’s be precise about the costs:
Time Complexity: O(n²) comparisons. For each of the n positions, we potentially scan the remaining array to count smaller elements. This nested iteration gives quadratic behavior.
Space Complexity: O(1). We use only a constant number of variables regardless of input size.
Write Complexity: O(n) writes. For distinct elements, exactly n - f writes, where f is the number of elements already in their final positions (the trivial 1-cycles of the permutation): every misplaced element is written exactly once. This is provably optimal for any in-place sort.
```python
def instrumented_cycle_sort(arr):
    """Cycle sort with detailed operation counting."""
    writes = 0
    comparisons = 0
    n = len(arr)
    for cycle_start in range(n - 1):
        item = arr[cycle_start]
        pos = cycle_start
        for i in range(cycle_start + 1, n):
            comparisons += 1
            if arr[i] < item:
                pos += 1
        if pos == cycle_start:
            continue
        while item == arr[pos]:
            comparisons += 1
            pos += 1
        arr[pos], item = item, arr[pos]
        writes += 1
        while pos != cycle_start:
            pos = cycle_start
            for i in range(cycle_start + 1, n):
                comparisons += 1
                if arr[i] < item:
                    pos += 1
            while item == arr[pos]:
                comparisons += 1
                pos += 1
            arr[pos], item = item, arr[pos]
            writes += 1
    return {"writes": writes, "comparisons": comparisons}

# Compare with array size
import random

for size in [10, 100, 1000]:
    arr = list(range(size))
    random.shuffle(arr)
    stats = instrumented_cycle_sort(arr)
    print(f"n={size}: writes={stats['writes']}, comparisons={stats['comparisons']}")
    print(f"  writes/n = {stats['writes']/size:.2f}, comparisons/n² = {stats['comparisons']/(size*size):.2f}")
```
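The write bound can also be verified empirically: on a random permutation of distinct values, the write count should equal n minus the number of elements that started in their sorted position. A quick property check (repeating `cycle_sort` so the snippet runs standalone):

```python
import random

def cycle_sort(arr):
    """Cycle sort, as shown earlier; returns the number of array writes."""
    writes = 0
    n = len(arr)
    for cycle_start in range(n - 1):
        item = arr[cycle_start]
        pos = cycle_start
        for i in range(cycle_start + 1, n):
            if arr[i] < item:
                pos += 1
        if pos == cycle_start:
            continue
        while item == arr[pos]:
            pos += 1
        arr[pos], item = item, arr[pos]
        writes += 1
        while pos != cycle_start:
            pos = cycle_start
            for i in range(cycle_start + 1, n):
                if arr[i] < item:
                    pos += 1
            while item == arr[pos]:
                pos += 1
            arr[pos], item = item, arr[pos]
            writes += 1
    return writes

random.seed(42)
for _ in range(100):
    original = random.sample(range(1000), 50)  # 50 distinct values
    # Elements already sitting at their sorted index need zero writes
    fixed = sum(1 for i, v in enumerate(original) if sorted(original)[i] == v)
    arr = original[:]
    writes = cycle_sort(arr)
    assert arr == sorted(original)
    assert writes == len(original) - fixed  # n minus elements already in place
print("write bound verified on 100 random permutations")
```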
Handling Duplicates
Duplicates introduce a subtle bug if not handled correctly. When multiple elements share the same value, they all compute the same target position. Without adjustment, we’d enter an infinite loop or corrupt data.
The solution: when placing an element, skip past any duplicates already occupying that position range.
```python
def cycle_sort_with_duplicates(arr):
    """
    Cycle sort handling duplicates correctly.
    Note: this version is NOT stable; equal elements may be reordered.
    """
    writes = 0
    n = len(arr)
    for cycle_start in range(n - 1):
        item = arr[cycle_start]
        pos = cycle_start
        # Elements before cycle_start are already in place, so counting
        # smaller elements from cycle_start + 1 gives the final rank
        for i in range(cycle_start + 1, n):
            if arr[i] < item:
                pos += 1
        if pos == cycle_start:
            continue
        # Critical: skip past equal values to find a free slot
        while item == arr[pos]:
            pos += 1
        arr[pos], item = item, arr[pos]
        writes += 1
        while pos != cycle_start:
            pos = cycle_start
            for i in range(cycle_start + 1, n):
                if arr[i] < item:
                    pos += 1
            # Skip duplicates here too
            while item == arr[pos]:
                pos += 1
            arr[pos], item = item, arr[pos]
            writes += 1
    return writes

# Test with duplicates
test_dupes = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
writes = cycle_sort_with_duplicates(test_dupes)
print(f"Sorted: {test_dupes}")
print(f"Writes: {writes}")
```
The `while item == arr[pos]` loop is the critical addition. It ensures we place each duplicate in a distinct position, maintaining correctness at the cost of additional comparisons.
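A two-line illustration of the collision: in `[2, 2, 1]` the first element's computed rank is 1, but that slot already holds an equal value, so a plain swap would hand back an identical item and the rotation loop would never terminate:

```python
arr = [2, 2, 1]
item = arr[0]
# Rank counted from index 1 onward, exactly as the algorithm does it
pos = sum(1 for x in arr[1:] if x < item)
print(pos, arr[pos] == item)  # 1 True -- the target slot holds a duplicate
```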
Practical Applications
Use cycle sort when:
- Writing to flash memory, EEPROM, or other media with limited write endurance
- Sorting small datasets where O(n²) time is acceptable
- Write operations are orders of magnitude more expensive than comparisons
- You need an in-place algorithm with minimal memory overhead
- Implementing wear-leveling algorithms or similar hardware-conscious code
Avoid cycle sort when:
- Performance is the primary concern (use quicksort, mergesort, or timsort)
- Working with large datasets where O(n²) becomes prohibitive
- Write cost is comparable to comparison cost (most RAM operations)
- You need a stable sort (cycle sort reorders equal elements)
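To put rough numbers on the write gap (illustrative arithmetic only, not a hardware benchmark): for a hypothetical 64-entry sensor log sorted in place, cycle sort's at-most-n writes against insertion sort's worst-case n(n-1)/2 writes is roughly a 31x difference per sort, which compounds directly into flash lifespan.

```python
n = 64
cycle_writes = n                     # cycle sort: at most n writes per sort
insertion_writes = n * (n - 1) // 2  # insertion sort: worst-case shifts plus final write
print(insertion_writes / cycle_writes)  # 31.5
```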
Comparison with Other Algorithms
Here’s how cycle sort stacks up against other simple sorting algorithms:
```python
def selection_sort_instrumented(arr):
    writes = 0
    n = len(arr)
    for i in range(n):
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        if min_idx != i:
            arr[i], arr[min_idx] = arr[min_idx], arr[i]
            writes += 2  # Two writes per swap
    return writes

def insertion_sort_instrumented(arr):
    writes = 0
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            writes += 1
            j -= 1
        arr[j + 1] = key
        writes += 1
    return writes

# Benchmark comparison
import random

print(f"{'Algorithm':<20} {'n=100':<12} {'n=500':<12} {'n=1000':<12}")
print("-" * 56)
for name, sort_func in [
    ("Cycle Sort", instrumented_cycle_sort),
    ("Selection Sort", selection_sort_instrumented),
    ("Insertion Sort", insertion_sort_instrumented),
]:
    results = []
    for size in [100, 500, 1000]:
        arr = list(range(size))
        random.shuffle(arr)
        if name == "Cycle Sort":
            writes = sort_func(arr)["writes"]
        else:
            writes = sort_func(arr)
        results.append(writes)
    print(f"{name:<20} {results[0]:<12} {results[1]:<12} {results[2]:<12}")
```
| Algorithm | Time Complexity | Write Complexity | Space |
|---|---|---|---|
| Cycle Sort | O(n²) | O(n) optimal | O(1) |
| Selection Sort | O(n²) | O(n) | O(1) |
| Insertion Sort | O(n²) | O(n²) worst case | O(1) |
| Quicksort | O(n log n) avg | O(n log n) | O(log n) |
Selection sort comes close on writes but still performs up to 2(n - 1) writes, two per swap. Cycle sort's guarantee of at most n writes, and often fewer, makes it the clear winner when writes are the bottleneck.
Cycle sort occupies a narrow but important niche. It’s not your everyday sorting algorithm, but when you’re counting write cycles on embedded flash or optimizing for SSD longevity, it’s the mathematically optimal choice. Keep it in your toolbox for those specific situations where every write counts.