# Radix Sort: Non-Comparison Integer Sorting
## Key Insights
- Radix sort achieves O(nk) time complexity by avoiding comparisons entirely, processing integers digit-by-digit using counting sort as a stable subroutine
- Choosing the right radix (base) dramatically affects performance—base-256 often outperforms base-10 by reducing passes while maintaining reasonable memory overhead
- Radix sort excels with fixed-length integers and strings but requires careful handling for negative numbers and isn’t suitable for floating-point data without transformation
## Breaking the O(n log n) Barrier
Every computer science student learns that comparison-based sorting algorithms have a theoretical lower bound of O(n log n). This isn’t a limitation of our algorithms—it’s a mathematical certainty proven through decision tree analysis. Quicksort, mergesort, and heapsort all bump against this ceiling.
But what if we stopped comparing elements altogether?
Radix sort sidesteps the comparison model entirely. Instead of asking “is A greater than B?”, it examines the digits of each number and groups them accordingly. This non-comparison approach yields O(nk) time complexity, where n is the number of elements and k is the number of digits (or the key length).
The algorithm dates back to Herman Hollerith’s punch card tabulating machines in the 1880s. Operators would sort cards by examining one column at a time, starting from the rightmost digit. This mechanical process is exactly how LSD radix sort works today—we’ve just replaced punch cards with memory addresses.
## How Radix Sort Works
Radix sort processes numbers digit by digit, either from the least significant digit (LSD) to the most significant, or vice versa (MSD). The LSD approach is more common for integers because it’s simpler to implement and naturally handles variable-length numbers.
The critical insight is that each digit-sorting pass must be stable—elements with equal keys maintain their relative order. This stability allows information from previous passes to be preserved.
Let's trace through sorting `[170, 45, 75, 90, 802, 24, 2, 66]`:

```text
Original: [170, 45, 75, 90, 802, 24, 2, 66]

Pass 1 (ones place):
  0: 170, 90
  2: 802, 2
  4: 24
  5: 45, 75
  6: 66
Result: [170, 90, 802, 2, 24, 45, 75, 66]

Pass 2 (tens place):
  0: 802, 2
  2: 24
  4: 45
  6: 66
  7: 170, 75
  9: 90
Result: [802, 2, 24, 45, 66, 170, 75, 90]

Pass 3 (hundreds place):
  0: 2, 24, 45, 66, 75, 90
  1: 170
  8: 802
Result: [2, 24, 45, 66, 75, 90, 170, 802]
```
Notice how stability matters: after pass 1, both 170 and 90 have 0 in the ones place, but 170 comes first because it appeared first in the input. This ordering is preserved through subsequent passes.
## Counting Sort as a Subroutine
Radix sort delegates the actual digit-sorting to counting sort, a simple algorithm that runs in O(n + r) time, where r is the range of possible digit values (10 for decimal digits).
Counting sort works by:
- Counting occurrences of each digit
- Computing cumulative counts to determine positions
- Placing elements in their final positions
Here’s a counting sort implementation for a single digit position:
```python
def counting_sort_by_digit(arr: list[int], exp: int) -> list[int]:
    """
    Sort array by a specific digit position.
    exp: 1 for ones, 10 for tens, 100 for hundreds, etc.
    """
    n = len(arr)
    output = [0] * n
    count = [0] * 10  # digits 0-9

    # Count occurrences of each digit
    for num in arr:
        digit = (num // exp) % 10
        count[digit] += 1

    # Convert counts to cumulative positions
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Build output array (iterate backwards for stability)
    for i in range(n - 1, -1, -1):
        digit = (arr[i] // exp) % 10
        count[digit] -= 1
        output[count[digit]] = arr[i]

    return output
```
The backwards iteration in the final loop is crucial for stability. By processing elements from right to left, we ensure that elements with equal digits maintain their original relative order.
## Implementation: LSD Radix Sort
With counting sort as our building block, the full LSD radix sort becomes straightforward:
```python
def radix_sort(arr: list[int]) -> list[int]:
    """
    LSD Radix Sort for non-negative integers.
    Time: O(d * (n + k)) where d = digits, k = radix (10)
    Space: O(n + k)
    """
    if not arr:
        return arr

    # Find the maximum number to determine digit count
    max_val = max(arr)

    # Process each digit position
    exp = 1
    result = arr.copy()
    while max_val // exp > 0:
        result = counting_sort_by_digit(result, exp)
        exp *= 10
    return result

def counting_sort_by_digit(arr: list[int], exp: int) -> list[int]:
    n = len(arr)
    output = [0] * n
    count = [0] * 10

    for num in arr:
        digit = (num // exp) % 10
        count[digit] += 1

    for i in range(1, 10):
        count[i] += count[i - 1]

    for i in range(n - 1, -1, -1):
        digit = (arr[i] // exp) % 10
        count[digit] -= 1
        output[count[digit]] = arr[i]
    return output

# Usage
numbers = [170, 45, 75, 90, 802, 24, 2, 66]
sorted_numbers = radix_sort(numbers)
print(sorted_numbers)  # [2, 24, 45, 66, 75, 90, 170, 802]
```
Here’s the equivalent in JavaScript for those working in that ecosystem:
```javascript
function radixSort(arr) {
  if (arr.length === 0) return arr;

  const maxVal = Math.max(...arr);
  let result = [...arr];

  for (let exp = 1; Math.floor(maxVal / exp) > 0; exp *= 10) {
    result = countingSortByDigit(result, exp);
  }
  return result;
}

function countingSortByDigit(arr, exp) {
  const n = arr.length;
  const output = new Array(n);
  const count = new Array(10).fill(0);

  for (const num of arr) {
    const digit = Math.floor(num / exp) % 10;
    count[digit]++;
  }
  for (let i = 1; i < 10; i++) {
    count[i] += count[i - 1];
  }
  for (let i = n - 1; i >= 0; i--) {
    const digit = Math.floor(arr[i] / exp) % 10;
    count[digit]--;
    output[count[digit]] = arr[i];
  }
  return output;
}
```
## Optimizing with Different Radix Values
Using base-10 is intuitive but inefficient. Each pass only distinguishes between 10 buckets, meaning a 32-bit integer requires up to 10 passes. By increasing the radix, we reduce passes at the cost of more memory for counting arrays.
Base-256 (2^8) is a sweet spot for 32-bit integers: only 4 passes needed, and the counting array fits comfortably in L1 cache. Bit manipulation makes digit extraction fast:
```python
def radix_sort_base256(arr: list[int]) -> list[int]:
    """
    Radix sort using base-256 for better performance.
    Only 4 passes needed for 32-bit integers.
    """
    if not arr:
        return arr

    result = arr.copy()
    # Process each byte (8 bits at a time)
    for shift in range(0, 32, 8):
        result = counting_sort_byte(result, shift)
    return result

def counting_sort_byte(arr: list[int], shift: int) -> list[int]:
    n = len(arr)
    output = [0] * n
    count = [0] * 256  # 2^8 possible byte values

    # Count occurrences
    for num in arr:
        byte_val = (num >> shift) & 0xFF
        count[byte_val] += 1

    # Cumulative sum
    for i in range(1, 256):
        count[i] += count[i - 1]

    # Build output (backwards for stability)
    for i in range(n - 1, -1, -1):
        byte_val = (arr[i] >> shift) & 0xFF
        count[byte_val] -= 1
        output[count[byte_val]] = arr[i]
    return output
```
The trade-off matrix looks like this:
| Radix | Passes (32-bit) | Count Array Size | Cache Friendly |
|---|---|---|---|
| 10 | 10 | 40 bytes | Yes |
| 256 | 4 | 1 KB | Yes |
| 65536 | 2 | 256 KB | Borderline |
For most applications, base-256 offers the best balance. Base-65536 can be faster for very large arrays where the reduced passes outweigh cache misses.
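The radix need not be hard-coded. A minimal sketch of a parameterized variant (the function name `radix_sort_bits` and its `bits`/`key_bits` parameters are ours, not from the article's listings) shows how the table's trade-offs map to a single knob:

```python
def radix_sort_bits(arr: list[int], bits: int = 8, key_bits: int = 32) -> list[int]:
    """LSD radix sort for non-negative integers, processing `bits` bits per pass.

    bits=8 gives base-256 (4 passes for 32-bit keys);
    bits=16 gives base-65536 (2 passes).
    """
    if not arr:
        return arr

    mask = (1 << bits) - 1          # extracts one "digit" of `bits` bits
    result = list(arr)

    for shift in range(0, key_bits, bits):
        # One stable counting-sort pass on the current digit
        count = [0] * (1 << bits)
        for num in result:
            count[(num >> shift) & mask] += 1
        for i in range(1, len(count)):
            count[i] += count[i - 1]

        output = [0] * len(result)
        for num in reversed(result):    # backwards for stability
            digit = (num >> shift) & mask
            count[digit] -= 1
            output[count[digit]] = num
        result = output

    return result
```

Calling `radix_sort_bits(data, bits=16)` trades the 256 KB counting array for half the passes, matching the last row of the table.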
## When to Use Radix Sort
Radix sort shines in specific scenarios:
**Ideal use cases:**
- Sorting fixed-width integers (IDs, timestamps, hash values)
- String sorting with fixed or bounded lengths
- When you need guaranteed linear-time behavior for fixed-width keys (no worst-case degradation)
- Large datasets where quicksort's O(n²) worst case is unacceptable
**Limitations to consider:**
- Negative numbers require transformation (offset or separate handling)
- Floating-point numbers need bit-level manipulation to sort correctly
- Variable-length strings complicate the implementation
- Small arrays favor simpler algorithms due to overhead
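The negative-number limitation is usually handled with the offset approach mentioned above: shift every value to be non-negative, sort, then shift back. A hedged sketch (the helper name `radix_sort_signed` is ours, and for brevity it uses a bucket-based LSD pass rather than the counting-sort subroutine above):

```python
def radix_sort_signed(arr: list[int]) -> list[int]:
    """Sort integers that may be negative by biasing them into
    the non-negative range, radix-sorting, then un-biasing."""
    if not arr:
        return arr

    # Bias all values so the minimum maps to 0
    offset = -min(arr) if min(arr) < 0 else 0
    result = [x + offset for x in arr]

    # Plain LSD radix sort, base 10, via stable bucket passes
    exp = 1
    max_val = max(result)
    while max_val // exp > 0:
        buckets = [[] for _ in range(10)]
        for num in result:
            buckets[(num // exp) % 10].append(num)  # append preserves order: stable
        result = [num for bucket in buckets for num in bucket]
        exp *= 10

    # Undo the bias
    return [x - offset for x in result]
```

The sign-bit flip mentioned for floating-point data follows the same idea at the bit level, but requires reinterpreting the float's bits as an integer first.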
Here’s a quick benchmark comparison:
```python
import random
import time

def benchmark(sort_func, arr, name):
    test_arr = arr.copy()
    start = time.perf_counter()
    sort_func(test_arr)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.4f}s")

# Generate test data
sizes = [100_000, 1_000_000]
for size in sizes:
    print(f"\n--- {size:,} integers (0 to 1,000,000) ---")
    arr = [random.randint(0, 1_000_000) for _ in range(size)]
    benchmark(lambda x: x.sort(), arr, "Timsort (built-in)")
    benchmark(radix_sort_base256, arr, "Radix Sort (base-256)")
```
Be aware that in pure Python the interpreter overhead of these loops typically lets the C-implemented Timsort win regardless of array size; radix sort's asymptotic advantage shows up in compiled implementations (C, Rust, or NumPy-backed code), where it routinely beats comparison sorts on large arrays of bounded-range integers, with the gap widening as array size grows.
## Practical Applications
Radix sort appears in production systems more often than you might expect:
**Database indexing:** Suffix arrays, used for full-text search, are often constructed using radix sort variants. The DC3 algorithm for suffix array construction relies heavily on radix sort's linear-time guarantee.

**Network applications:** Sorting IP addresses (32-bit integers) and MAC addresses (48-bit) maps perfectly to radix sort. Network monitoring tools processing millions of packets benefit from predictable performance.

**Graphics engines:** Depth sorting for transparency rendering often uses radix sort. When you need to sort millions of fragments per frame at 60 FPS, the consistent O(n) behavior matters more than theoretical constants.

**Parallel implementations:** Radix sort parallelizes exceptionally well. Each counting pass can be split across threads with a final merge step. GPU implementations achieve remarkable throughput—NVIDIA's CUB library includes a highly optimized parallel radix sort that processes billions of keys per second.
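To make the "split the counting pass across threads" idea concrete, here is an illustrative sketch of just the histogram phase of one base-256 pass (the helper `parallel_digit_histogram` is our own construction, not a library API; note that in CPython the GIL limits real speedup for pure-Python loops, so this shows the structure rather than a performance win):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_digit_histogram(arr: list[int], shift: int, workers: int = 4) -> list[int]:
    """Counting phase of one base-256 radix pass, split across workers.

    Each worker builds a local histogram over its chunk; the local
    histograms are then summed element-wise into one global histogram.
    """
    def count_chunk(chunk: list[int]) -> list[int]:
        count = [0] * 256
        for num in chunk:
            count[(num >> shift) & 0xFF] += 1
        return count

    # Partition the input into roughly equal chunks, one per worker
    size = max(1, len(arr) // workers)
    chunks = [arr[i:i + size] for i in range(0, len(arr), size)]

    with ThreadPoolExecutor(max_workers=workers) as ex:
        partials = list(ex.map(count_chunk, chunks))

    # Merge: sum each bucket across all partial histograms
    return [sum(column) for column in zip(*partials)]
```

The prefix-sum and scatter phases can be parallelized similarly, which is essentially what GPU radix sorts do at much finer granularity.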
The algorithm’s simplicity and predictability make it a workhorse for systems where performance consistency matters as much as raw speed. When you’re processing fixed-format data at scale, radix sort deserves a place in your toolkit.