Cycle Sort: Minimum Write Sorting
Key Insights
- Cycle sort achieves the theoretical minimum number of writes for any in-place sorting algorithm: every misplaced element is written exactly once, so the total is n minus the number of elements already in their final positions
- The algorithm exploits the mathematical property that every permutation can be decomposed into disjoint cycles, placing each element directly into its final position
- While its O(n²) time complexity makes it unsuitable for general-purpose sorting, cycle sort becomes invaluable when write operations are expensive (flash memory, EEPROM, SSD wear leveling)
Introduction to Cycle Sort
Most sorting algorithm discussions focus on comparison counts and time complexity. We obsess over whether quicksort beats mergesort by a constant factor, while ignoring a metric that matters enormously in certain contexts: write operations.
Cycle sort, developed by B.K. Haddon in 1990, takes a different approach. It asks a simple question: what’s the minimum number of writes needed to sort an array in place? The answer is surprisingly elegant—each element needs to move at most once to its final position.
This matters more than you might think. Flash memory cells degrade after a limited number of write cycles (typically 10,000-100,000 for consumer SSDs). EEPROM in embedded systems often has even stricter limits. When you’re writing firmware for a device that needs to sort sensor readings and store them to flash, the difference between O(n log n) writes and O(n) writes translates directly to hardware lifespan.
The Core Concept: Cycle Detection
The mathematical foundation of cycle sort comes from permutation theory. Consider any unsorted array as a permutation of its sorted form. Every permutation can be decomposed into disjoint cycles.
Here’s the intuition: pick any element. It belongs somewhere specific in the sorted array. The element currently occupying that position also belongs somewhere else. Follow this chain, and you’ll eventually return to your starting element. That’s a cycle.
```python
def visualize_cycles(arr):
    """Demonstrate cycle detection in a permutation.

    Assumes distinct values; duplicates would collide in position_map.
    """
    sorted_arr = sorted(arr)
    position_map = {v: i for i, v in enumerate(sorted_arr)}
    visited = [False] * len(arr)
    cycles = []
    for start in range(len(arr)):
        if visited[start]:
            continue
        cycle = []
        current = start
        while not visited[current]:
            visited[current] = True
            cycle.append((current, arr[current]))
            # Where does this element need to go?
            current = position_map[arr[current]]
        if len(cycle) > 1:  # Single-element cycles are already in place
            cycles.append(cycle)
    return cycles

# Example: [4, 2, 1, 3] should become [1, 2, 3, 4]
arr = [4, 2, 1, 3]
cycles = visualize_cycles(arr)
for i, cycle in enumerate(cycles):
    print(f"Cycle {i + 1}: {' -> '.join(f'{v}@{p}' for p, v in cycle)}")
```
Running this reveals the structure hidden in the unsorted array. Element 4 at position 0 belongs at position 3. Element 3 at position 3 belongs at position 2. Element 1 at position 2 belongs at position 0. That’s one cycle: 4 → 3 → 1 → 4. Element 2 is already in its correct position—a trivial cycle of length 1.
Algorithm Walkthrough
The cycle sort algorithm operationalizes this insight. For each position, we find the element that belongs there and rotate through the cycle until every element in that cycle reaches its destination.
```python
def cycle_sort(arr):
    """
    Sort array in-place using cycle sort.
    Returns the number of writes performed.
    """
    writes = 0
    n = len(arr)
    # Traverse each cycle starting position
    for cycle_start in range(n - 1):
        item = arr[cycle_start]
        # Find the correct position for this item
        # by counting elements smaller than it
        pos = cycle_start
        for i in range(cycle_start + 1, n):
            if arr[i] < item:
                pos += 1
        # Item is already in the correct position
        if pos == cycle_start:
            continue
        # Skip duplicates - find the next available slot
        while item == arr[pos]:
            pos += 1
        # Place the item in its correct position
        arr[pos], item = item, arr[pos]
        writes += 1
        # Rotate the rest of the cycle
        while pos != cycle_start:
            pos = cycle_start
            # Find position for the displaced item
            for i in range(cycle_start + 1, n):
                if arr[i] < item:
                    pos += 1
            # Skip duplicates
            while item == arr[pos]:
                pos += 1
            # Place and continue
            arr[pos], item = item, arr[pos]
            writes += 1
    return writes

# Demonstration
test = [4, 2, 1, 3, 5]
writes = cycle_sort(test)
print(f"Sorted: {test}")
print(f"Total writes: {writes}")
```
The key insight is counting elements smaller than our current item to determine its sorted position. This avoids needing a separate sorted reference array. Each swap places exactly one element in its final position, never to be moved again.
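This counting idea can be checked on its own: with distinct values, an element's final index equals the number of elements smaller than it. A minimal sketch (the `rank_of` helper is hypothetical, introduced here only for illustration):

```python
def rank_of(arr, index):
    """Final sorted index of arr[index]: the count of strictly smaller elements."""
    return sum(1 for x in arr if x < arr[index])

arr = [4, 2, 1, 3]
ranks = [rank_of(arr, i) for i in range(len(arr))]
print(ranks)  # [3, 1, 0, 2] -- element 4 belongs at index 3, and so on
```

Cycle sort only counts from `cycle_start + 1` onward because everything before `cycle_start` is already in its final position and cannot affect the rank.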
Complexity Analysis
Let’s be precise about the costs:
Time Complexity: O(n²) comparisons. For each of the n positions, we potentially scan the remaining array to count smaller elements. This nested iteration gives quadratic behavior.
Space Complexity: O(1). We use only a constant number of variables regardless of input size.
Write Complexity: O(n) writes. For distinct elements, exactly n - f writes, where f is the number of elements already in their final positions (the trivial 1-cycles of the permutation): every misplaced element is written exactly once. This is provably optimal for any in-place sort.
```python
def instrumented_cycle_sort(arr):
    """Cycle sort with detailed operation counting."""
    writes = 0
    comparisons = 0
    n = len(arr)
    for cycle_start in range(n - 1):
        item = arr[cycle_start]
        pos = cycle_start
        for i in range(cycle_start + 1, n):
            comparisons += 1
            if arr[i] < item:
                pos += 1
        if pos == cycle_start:
            continue
        while item == arr[pos]:
            comparisons += 1
            pos += 1
        arr[pos], item = item, arr[pos]
        writes += 1
        while pos != cycle_start:
            pos = cycle_start
            for i in range(cycle_start + 1, n):
                comparisons += 1
                if arr[i] < item:
                    pos += 1
            while item == arr[pos]:
                comparisons += 1
                pos += 1
            arr[pos], item = item, arr[pos]
            writes += 1
    return {"writes": writes, "comparisons": comparisons}

# Compare with array size
import random

for size in [10, 100, 1000]:
    arr = list(range(size))
    random.shuffle(arr)
    stats = instrumented_cycle_sort(arr)
    print(f"n={size}: writes={stats['writes']}, comparisons={stats['comparisons']}")
    print(f"  writes/n = {stats['writes']/size:.2f}, comparisons/n² = {stats['comparisons']/(size*size):.2f}")
```
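The write bound can also be verified empirically: on a random permutation of distinct values, the write count should equal n minus the number of elements that started in their sorted position. A quick property check (repeating `cycle_sort` so the snippet runs standalone):

```python
import random

def cycle_sort(arr):
    """Cycle sort, as shown earlier; returns the number of array writes."""
    writes = 0
    n = len(arr)
    for cycle_start in range(n - 1):
        item = arr[cycle_start]
        pos = cycle_start
        for i in range(cycle_start + 1, n):
            if arr[i] < item:
                pos += 1
        if pos == cycle_start:
            continue
        while item == arr[pos]:
            pos += 1
        arr[pos], item = item, arr[pos]
        writes += 1
        while pos != cycle_start:
            pos = cycle_start
            for i in range(cycle_start + 1, n):
                if arr[i] < item:
                    pos += 1
            while item == arr[pos]:
                pos += 1
            arr[pos], item = item, arr[pos]
            writes += 1
    return writes

random.seed(42)
for _ in range(100):
    original = random.sample(range(1000), 50)  # 50 distinct values
    # Elements already sitting at their sorted index need zero writes
    fixed = sum(1 for i, v in enumerate(original) if sorted(original)[i] == v)
    arr = original[:]
    writes = cycle_sort(arr)
    assert arr == sorted(original)
    assert writes == len(original) - fixed  # n minus elements already in place
print("write bound verified on 100 random permutations")
```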
Handling Duplicates
Duplicates introduce a subtle bug if not handled correctly. When multiple elements share the same value, they all compute the same target position. Without adjustment, we’d enter an infinite loop or corrupt data.
The solution: when placing an element, skip past any duplicates already occupying that position range.
```python
def cycle_sort_with_duplicates(arr):
    """
    Cycle sort handling duplicates correctly.
    Note: this version is NOT stable; equal elements may be reordered.
    """
    writes = 0
    n = len(arr)
    for cycle_start in range(n - 1):
        item = arr[cycle_start]
        pos = cycle_start
        # Elements before cycle_start are already in place, so counting
        # smaller elements from cycle_start + 1 gives the final rank
        for i in range(cycle_start + 1, n):
            if arr[i] < item:
                pos += 1
        if pos == cycle_start:
            continue
        # Critical: skip past equal values to find a free slot
        while item == arr[pos]:
            pos += 1
        arr[pos], item = item, arr[pos]
        writes += 1
        while pos != cycle_start:
            pos = cycle_start
            for i in range(cycle_start + 1, n):
                if arr[i] < item:
                    pos += 1
            # Skip duplicates here too
            while item == arr[pos]:
                pos += 1
            arr[pos], item = item, arr[pos]
            writes += 1
    return writes

# Test with duplicates
test_dupes = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
writes = cycle_sort_with_duplicates(test_dupes)
print(f"Sorted: {test_dupes}")
print(f"Writes: {writes}")
```
The `while item == arr[pos]` loop is the critical addition. It ensures we place each duplicate in a distinct position, maintaining correctness at the cost of additional comparisons.
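A two-line illustration of the collision: in `[2, 2, 1]` the first element's computed rank is 1, but that slot already holds an equal value, so a plain swap would hand back an identical item and the rotation loop would never terminate:

```python
arr = [2, 2, 1]
item = arr[0]
# Rank counted from index 1 onward, exactly as the algorithm does it
pos = sum(1 for x in arr[1:] if x < item)
print(pos, arr[pos] == item)  # 1 True -- the target slot holds a duplicate
```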
Practical Applications
Use cycle sort when:
- Writing to flash memory, EEPROM, or other media with limited write endurance
- Sorting small datasets where O(n²) time is acceptable
- Write operations are orders of magnitude more expensive than comparisons
- You need an in-place algorithm with minimal memory overhead
- Implementing wear-leveling algorithms or similar hardware-conscious code
Avoid cycle sort when:
- Performance is the primary concern (use quicksort, mergesort, or timsort)
- Working with large datasets where O(n²) becomes prohibitive
- Write cost is comparable to comparison cost (most RAM operations)
- You need a stable sort (cycle sort reorders equal elements)
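To put rough numbers on the write gap (illustrative arithmetic only, not a hardware benchmark): for a hypothetical 64-entry sensor log sorted in place, cycle sort's at-most-n writes against insertion sort's worst-case n(n-1)/2 writes is roughly a 31x difference per sort, which compounds directly into flash lifespan.

```python
n = 64
cycle_writes = n                     # cycle sort: at most n writes per sort
insertion_writes = n * (n - 1) // 2  # insertion sort: worst-case shifts plus final write
print(insertion_writes / cycle_writes)  # 31.5
```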
Comparison with Other Algorithms
Here’s how cycle sort stacks up against other simple sorting algorithms:
```python
def selection_sort_instrumented(arr):
    writes = 0
    n = len(arr)
    for i in range(n):
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        if min_idx != i:
            arr[i], arr[min_idx] = arr[min_idx], arr[i]
            writes += 2  # Two writes per swap
    return writes

def insertion_sort_instrumented(arr):
    writes = 0
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            writes += 1
            j -= 1
        arr[j + 1] = key
        writes += 1
    return writes

# Benchmark comparison
import random

print(f"{'Algorithm':<20} {'n=100':<12} {'n=500':<12} {'n=1000':<12}")
print("-" * 56)
for name, sort_func in [
    ("Cycle Sort", instrumented_cycle_sort),
    ("Selection Sort", selection_sort_instrumented),
    ("Insertion Sort", insertion_sort_instrumented),
]:
    results = []
    for size in [100, 500, 1000]:
        arr = list(range(size))
        random.shuffle(arr)
        if name == "Cycle Sort":
            writes = sort_func(arr)["writes"]
        else:
            writes = sort_func(arr)
        results.append(writes)
    print(f"{name:<20} {results[0]:<12} {results[1]:<12} {results[2]:<12}")
```
| Algorithm | Time Complexity | Write Complexity | Space |
|---|---|---|---|
| Cycle Sort | O(n²) | O(n) optimal | O(1) |
| Selection Sort | O(n²) | O(n) | O(1) |
| Insertion Sort | O(n²) | O(n²) worst case | O(1) |
| Quicksort | O(n log n) avg | O(n log n) | O(log n) |
Selection sort comes close on writes but still performs up to 2(n - 1) writes, two per swap. Cycle sort's guarantee of at most n writes, and often fewer, makes it the clear winner when writes are the bottleneck.
Cycle sort occupies a narrow but important niche. It’s not your everyday sorting algorithm, but when you’re counting write cycles on embedded flash or optimizing for SSD longevity, it’s the mathematically optimal choice. Keep it in your toolbox for those specific situations where every write counts.