Radix Sort: Non-Comparison Integer Sorting

Key Insights

  • Radix sort achieves O(nk) time complexity by avoiding comparisons entirely, processing integers digit-by-digit using counting sort as a stable subroutine
  • Choosing the right radix (base) dramatically affects performance—base-256 often outperforms base-10 by reducing passes while maintaining reasonable memory overhead
  • Radix sort excels with fixed-length integers and strings but requires careful handling for negative numbers and isn’t suitable for floating-point data without transformation

Breaking the O(n log n) Barrier

Every computer science student learns that comparison-based sorting algorithms have a theoretical lower bound of O(n log n). This isn’t a limitation of our algorithms—it’s a mathematical certainty proven through decision tree analysis. Quicksort, mergesort, and heapsort all bump against this ceiling.

But what if we stopped comparing elements altogether?

Radix sort sidesteps the comparison model entirely. Instead of asking “is A greater than B?”, it examines the digits of each number and groups them accordingly. This non-comparison approach yields O(nk) time complexity, where n is the number of elements and k is the number of digits (or the key length).

The algorithm dates back to Herman Hollerith’s punch card tabulating machines in the 1880s. Operators would sort cards by examining one column at a time, starting from the rightmost digit. This mechanical process is exactly how LSD radix sort works today—we’ve just replaced punch cards with memory addresses.

How Radix Sort Works

Radix sort processes numbers digit by digit, either from the least significant digit (LSD) to the most significant, or vice versa (MSD). The LSD approach is more common for integers because it’s simpler to implement and naturally handles variable-length numbers.

The critical insight is that each digit-sorting pass must be stable—elements with equal keys maintain their relative order. This stability allows information from previous passes to be preserved.

Let’s trace through sorting [170, 45, 75, 90, 802, 24, 2, 66]:

Original:  [170, 45, 75, 90, 802, 24, 2, 66]

Pass 1 (ones place):
  0: 170, 90
  2: 802, 2
  4: 24
  5: 45, 75
  6: 66
Result:    [170, 90, 802, 2, 24, 45, 75, 66]

Pass 2 (tens place):
  0: 802, 2
  2: 24
  4: 45
  6: 66
  7: 170, 75
  9: 90
Result:    [802, 2, 24, 45, 66, 170, 75, 90]

Pass 3 (hundreds place):
  0: 2, 24, 45, 66, 75, 90
  1: 170
  8: 802
Result:    [2, 24, 45, 66, 75, 90, 170, 802]

Notice how stability matters: after pass 1, both 170 and 90 have 0 in the ones place, but 170 comes first because it appeared first in the input. This ordering is preserved through subsequent passes.

Counting Sort as a Subroutine

Radix sort delegates the actual digit-sorting to counting sort, a simple algorithm that runs in O(n + b) time, where b is the number of possible digit values (10 for decimal digits). Note that this b is the radix, distinct from the digit count k used earlier.

Counting sort works by:

  1. Counting occurrences of each digit
  2. Computing cumulative counts to determine positions
  3. Placing elements in their final positions

Here’s a counting sort implementation for a single digit position:

def counting_sort_by_digit(arr: list[int], exp: int) -> list[int]:
    """
    Sort array by a specific digit position.
    exp: 1 for ones, 10 for tens, 100 for hundreds, etc.
    """
    n = len(arr)
    output = [0] * n
    count = [0] * 10  # digits 0-9
    
    # Count occurrences of each digit
    for num in arr:
        digit = (num // exp) % 10
        count[digit] += 1
    
    # Convert counts to cumulative positions
    for i in range(1, 10):
        count[i] += count[i - 1]
    
    # Build output array (iterate backwards for stability)
    for i in range(n - 1, -1, -1):
        digit = (arr[i] // exp) % 10
        count[digit] -= 1
        output[count[digit]] = arr[i]
    
    return output

The backwards iteration in the final loop is crucial for stability. By processing elements from right to left, we ensure that elements with equal digits maintain their original relative order.
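To see the stability guarantee concretely, here is a single pass of the function above applied to the worked example (the function is repeated so the snippet runs standalone):

```python
def counting_sort_by_digit(arr: list[int], exp: int) -> list[int]:
    """Stable sort of arr by the digit selected by exp (1, 10, 100, ...)."""
    output = [0] * len(arr)
    count = [0] * 10
    for num in arr:                        # count each digit
        count[(num // exp) % 10] += 1
    for i in range(1, 10):                 # cumulative positions
        count[i] += count[i - 1]
    for i in range(len(arr) - 1, -1, -1):  # backwards pass keeps ties in input order
        d = (arr[i] // exp) % 10
        count[d] -= 1
        output[count[d]] = arr[i]
    return output

print(counting_sort_by_digit([170, 45, 75, 90, 802, 24, 2, 66], 1))
# [170, 90, 802, 2, 24, 45, 75, 66]: 170 still precedes 90
```

The output matches Pass 1 of the trace above, with 170 ahead of 90 exactly as stability requires.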

Implementation: LSD Radix Sort

With counting sort as our building block, the full LSD radix sort becomes straightforward:

def radix_sort(arr: list[int]) -> list[int]:
    """
    LSD Radix Sort for non-negative integers.
    Time: O(d * (n + k)) where d = digits, k = radix (10)
    Space: O(n + k)
    """
    if not arr:
        return arr
    
    # Find the maximum number to determine digit count
    max_val = max(arr)
    
    # Process each digit position
    exp = 1
    result = arr.copy()
    
    while max_val // exp > 0:
        result = counting_sort_by_digit(result, exp)
        exp *= 10
    
    return result


def counting_sort_by_digit(arr: list[int], exp: int) -> list[int]:
    n = len(arr)
    output = [0] * n
    count = [0] * 10
    
    for num in arr:
        digit = (num // exp) % 10
        count[digit] += 1
    
    for i in range(1, 10):
        count[i] += count[i - 1]
    
    for i in range(n - 1, -1, -1):
        digit = (arr[i] // exp) % 10
        count[digit] -= 1
        output[count[digit]] = arr[i]
    
    return output


# Usage
numbers = [170, 45, 75, 90, 802, 24, 2, 66]
sorted_numbers = radix_sort(numbers)
print(sorted_numbers)  # [2, 24, 45, 66, 75, 90, 170, 802]

Here’s the equivalent in JavaScript for those working in that ecosystem:

function radixSort(arr) {
    if (arr.length === 0) return arr;
    
    const maxVal = Math.max(...arr);
    let result = [...arr];
    
    for (let exp = 1; Math.floor(maxVal / exp) > 0; exp *= 10) {
        result = countingSortByDigit(result, exp);
    }
    
    return result;
}

function countingSortByDigit(arr, exp) {
    const n = arr.length;
    const output = new Array(n);
    const count = new Array(10).fill(0);
    
    for (const num of arr) {
        const digit = Math.floor(num / exp) % 10;
        count[digit]++;
    }
    
    for (let i = 1; i < 10; i++) {
        count[i] += count[i - 1];
    }
    
    for (let i = n - 1; i >= 0; i--) {
        const digit = Math.floor(arr[i] / exp) % 10;
        count[digit]--;
        output[count[digit]] = arr[i];
    }
    
    return output;
}

Optimizing with Different Radix Values

Using base-10 is intuitive but inefficient. Each pass only distinguishes between 10 buckets, meaning a 32-bit integer requires up to 10 passes. By increasing the radix, we reduce passes at the cost of more memory for counting arrays.

Base-256 (2^8) is a sweet spot for 32-bit integers: only 4 passes needed, and the counting array fits comfortably in L1 cache. Bit manipulation makes digit extraction fast:

def radix_sort_base256(arr: list[int]) -> list[int]:
    """
    Radix sort using base-256 for better performance.
    Four passes cover non-negative integers below 2**32.
    """
    if not arr:
        return arr
    
    result = arr.copy()
    
    # Process each byte (8 bits at a time)
    for shift in range(0, 32, 8):
        result = counting_sort_byte(result, shift)
    
    return result


def counting_sort_byte(arr: list[int], shift: int) -> list[int]:
    n = len(arr)
    output = [0] * n
    count = [0] * 256  # 2^8 possible byte values
    
    # Count occurrences
    for num in arr:
        byte_val = (num >> shift) & 0xFF
        count[byte_val] += 1
    
    # Cumulative sum
    for i in range(1, 256):
        count[i] += count[i - 1]
    
    # Build output (backwards for stability)
    for i in range(n - 1, -1, -1):
        byte_val = (arr[i] >> shift) & 0xFF
        count[byte_val] -= 1
        output[count[byte_val]] = arr[i]
    
    return output

The trade-off matrix looks like this:

Radix    Passes (32-bit)    Count Array Size    Cache Friendly
10       10                 40 bytes            Yes
256      4                  1 KB                Yes
65536    2                  256 KB              Borderline

For most applications, base-256 offers the best balance. Base-65536 can be faster for very large arrays where the reduced passes outweigh cache misses.
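The pass counts in the table follow directly from the key width: a base-b radix sort over w-bit keys needs ceil(w / log2(b)) passes. A quick sketch of that formula (the helper name radix_passes is ours, not from any library):

```python
import math

def radix_passes(bit_width: int, radix: int) -> int:
    """Counting-sort passes needed to cover bit_width bits at the given radix."""
    return math.ceil(bit_width / math.log2(radix))

# Reproduces the table above for 32-bit keys
print(radix_passes(32, 10), radix_passes(32, 256), radix_passes(32, 65536))
# 10 4 2
```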

When to Use Radix Sort

Radix sort shines in specific scenarios:

Ideal use cases:

  • Sorting fixed-width integers (IDs, timestamps, hash values)
  • String sorting with fixed or bounded lengths
  • When you need guaranteed linear-time performance on fixed-width keys (no worst-case degradation)
  • Large datasets where quicksort’s O(n²) worst case is unacceptable

Limitations to consider:

  • Negative numbers require transformation (offset or separate handling)
  • Floating-point numbers need bit-level manipulation to sort correctly
  • Variable-length strings complicate the implementation
  • Small arrays favor simpler algorithms due to overhead
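One simple way to handle the negative-number limitation is to bias two's-complement values into the unsigned range before sorting, then un-bias afterwards. This is a sketch assuming 32-bit keys; radix_sort_signed and BIAS are illustrative names, not part of the implementations above:

```python
BIAS = 1 << 31  # maps -2**31..2**31-1 onto 0..2**32-1

def counting_sort_byte(arr: list[int], shift: int) -> list[int]:
    """Stable sort of arr by the byte selected by shift (0, 8, 16, 24)."""
    output = [0] * len(arr)
    count = [0] * 256
    for num in arr:
        count[(num >> shift) & 0xFF] += 1
    for i in range(1, 256):
        count[i] += count[i - 1]
    for i in range(len(arr) - 1, -1, -1):
        b = (arr[i] >> shift) & 0xFF
        count[b] -= 1
        output[count[b]] = arr[i]
    return output

def radix_sort_signed(arr: list[int]) -> list[int]:
    result = [x + BIAS for x in arr]   # every value is now non-negative
    for shift in range(0, 32, 8):
        result = counting_sort_byte(result, shift)
    return [x - BIAS for x in result]  # restore original values

print(radix_sort_signed([-5, 3, -1, 0, 42, -100]))
# [-100, -5, -1, 0, 3, 42]
```

The bias preserves ordering because adding a constant to every key is monotonic, so sorting the shifted values sorts the originals.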

Here’s a quick benchmark comparison:

import random
import time

def benchmark(sort_func, arr, name):
    test_arr = arr.copy()
    start = time.perf_counter()
    sort_func(test_arr)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.4f}s")

# Generate test data
sizes = [100_000, 1_000_000]
for size in sizes:
    print(f"\n--- {size:,} integers (0 to 1,000,000) ---")
    arr = [random.randint(0, 1_000_000) for _ in range(size)]
    
    benchmark(lambda x: x.sort(), arr, "Timsort (built-in)")
    benchmark(radix_sort_base256, arr, "Radix Sort (base-256)")

Benchmark results vary by machine and runtime. In CPython, the built-in Timsort runs in C, so a pure-Python radix sort typically needs very large arrays with bounded key ranges before its linear-time advantage outweighs interpreter overhead; in compiled languages the crossover comes much sooner, and the gap widens as array size grows.

Practical Applications

Radix sort appears in production systems more often than you might expect:

Database indexing: Suffix arrays, used for full-text search, are often constructed using radix sort variants. The DC3 algorithm for suffix array construction relies heavily on radix sort’s linear-time guarantee.

Network applications: Sorting IP addresses (32-bit integers) and MAC addresses (48-bit) maps perfectly to radix sort. Network monitoring tools processing millions of packets benefit from predictable performance.
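As a sketch of the IP-address case, the standard library can map dotted-quad strings to 32-bit integers, which a byte-wise radix sort like radix_sort_base256 above could then process; ip_to_int is an illustrative helper, and the built-in sorted() stands in here to keep the snippet standalone:

```python
import socket
import struct

def ip_to_int(ip: str) -> int:
    """Dotted-quad IPv4 string to its 32-bit big-endian integer value."""
    return struct.unpack("!I", socket.inet_aton(ip))[0]

ips = ["10.0.0.1", "192.168.1.1", "10.0.0.255", "172.16.0.1"]
print(sorted(ips, key=ip_to_int))
# ['10.0.0.1', '10.0.0.255', '172.16.0.1', '192.168.1.1']
```

Sorting by the integer value gives correct numeric ordering, where a plain lexicographic sort of the strings would not (for example, "10.0.0.255" would sort after "10.0.0.30").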

Graphics engines: Depth sorting for transparency rendering often uses radix sort. When you need to sort millions of fragments per frame at 60 FPS, the consistent O(n) behavior matters more than theoretical constants.

Parallel implementations: Radix sort parallelizes exceptionally well. Each counting pass can be split across threads with a final merge step. GPU implementations achieve remarkable throughput—NVIDIA’s CUB library includes a highly optimized parallel radix sort that processes billions of keys per second.

The algorithm’s simplicity and predictability make it a workhorse for systems where performance consistency matters as much as raw speed. When you’re processing fixed-format data at scale, radix sort deserves a place in your toolkit.
