Counting Sort: Linear-Time Integer Sorting
Key Insights
- Counting sort achieves O(n + k) time complexity by avoiding comparisons entirely, instead counting occurrences and calculating positions directly—but only works when the range k of input values is reasonable relative to n.
- Stability is counting sort’s secret weapon: by iterating through the input array in reverse during output construction, it preserves the relative order of equal elements, making it essential as a subroutine in radix sort.
- The algorithm trades space for speed, requiring O(k) auxiliary space for the count array, which makes it impractical when sorting small datasets with large value ranges but unbeatable for bounded integer data.
Breaking the O(n log n) Barrier
Every computer science student learns that comparison-based sorting algorithms have a fundamental lower bound of O(n log n). This isn’t a limitation of our creativity—it’s a mathematical certainty proven through decision tree analysis. Any algorithm that sorts by comparing elements must make at least log₂(n!) comparisons in the worst case.
But what if we don’t compare elements at all?
Counting sort sidesteps this limitation entirely. Instead of asking “is A greater than B?”, it asks “how many elements have this value?” This shift in approach unlocks linear-time sorting for integer data with a bounded range.
The time complexity is O(n + k), where n is the number of elements and k is the range of input values. When k is O(n) or smaller, you get genuine linear-time sorting. This isn’t a theoretical curiosity—it’s a practical tool you should reach for when sorting grades, ages, HTTP status codes, or any bounded integer data.
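To see the counting idea in miniature before the full algorithm, here is a deliberately naive sketch (the helper name `sort_by_counting` is ours, and this version is not stable) built on Python’s `collections.Counter`: tally each value, then emit every value in the range as many times as it occurred.

```python
from collections import Counter

def sort_by_counting(arr: list[int]) -> list[int]:
    """Naive illustration of the counting idea: no element-to-element comparisons."""
    if not arr:
        return []
    counts = Counter(arr)
    # Walk the value range in order, emitting each value once per occurrence
    return [v for v in range(min(arr), max(arr) + 1) for _ in range(counts[v])]

print(sort_by_counting([4, 2, 2, 8, 3, 3, 1]))  # [1, 2, 2, 3, 3, 4, 8]
```

Note the cost of that inner range walk: it touches every value in [min, max], occupied or not, which is exactly where the k term in O(n + k) comes from.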
How Counting Sort Works
The algorithm operates in three distinct phases:
- Count occurrences: Scan the input and count how many times each value appears
- Calculate positions: Transform counts into starting positions in the output array
- Build output: Place each element in its correct position
Let’s trace through a concrete example. Given the array [4, 2, 2, 8, 3, 3, 1]:
```
Input: [4, 2, 2, 8, 3, 3, 1]

Step 1 - Count occurrences:
  Value: 0 1 2 3 4 5 6 7 8
  Count: 0 1 2 2 1 0 0 0 1

Step 2 - Cumulative count (positions):
  Value: 0 1 2 3 4 5 6 7 8
  Count: 0 1 3 5 6 6 6 6 7

Step 3 - Build output (right to left):
  Process 1: position = count[1]-1 = 0, output[0] = 1
  Process 3: position = count[3]-1 = 4, output[4] = 3
  Process 3: position = count[3]-1 = 3, output[3] = 3
  ...

Output: [1, 2, 2, 3, 3, 4, 8]
```
Here’s the basic implementation:
```python
def counting_sort(arr: list[int]) -> list[int]:
    if not arr:
        return []

    max_val = max(arr)
    count = [0] * (max_val + 1)

    # Count occurrences
    for num in arr:
        count[num] += 1

    # Calculate cumulative positions
    for i in range(1, len(count)):
        count[i] += count[i - 1]

    # Build output array
    output = [0] * len(arr)
    for num in reversed(arr):
        count[num] -= 1
        output[count[num]] = num

    return output
```
The Algorithm in Detail
Let’s break down each phase with a more thoroughly annotated implementation:
```python
def counting_sort_detailed(arr: list[int]) -> list[int]:
    """
    Counting sort with detailed phase annotations.
    Assumes non-negative integers.
    """
    if not arr:
        return []

    n = len(arr)
    max_val = max(arr)

    # PHASE 1: Count occurrences
    # count[i] = number of elements equal to i
    count = [0] * (max_val + 1)
    for num in arr:
        count[num] += 1
    # After: count[v] tells us how many times v appears

    # PHASE 2: Cumulative count transformation
    # count[i] = number of elements <= i
    # This tells us where elements should END in output
    for i in range(1, len(count)):
        count[i] += count[i - 1]
    # After: count[v] tells us the position AFTER the last v

    # PHASE 3: Build sorted output
    # Iterate in reverse to maintain stability
    output = [0] * n
    for i in range(n - 1, -1, -1):
        num = arr[i]
        # Decrement first: count[num] now points to correct position
        count[num] -= 1
        output[count[num]] = num

    return output
```
The cumulative count transformation is the clever bit. After this step, count[v] tells you how many elements are less than or equal to v. This directly gives you the position where the last occurrence of v should go in the sorted output (minus one for zero-indexing).
When we decrement before placing, we’re essentially “reserving” positions from right to left for each value. The first v we encounter (scanning right to left) gets the rightmost position for vs, the second gets the next position left, and so on.
Stability and Why It Matters
A sorting algorithm is stable if elements with equal keys maintain their relative order from the input. This sounds academic until you need to sort by multiple criteria.
Consider sorting a list of students first by name, then by grade. If your grade sort isn’t stable, it destroys the name ordering for students with the same grade.
Counting sort achieves stability through one crucial detail: reverse iteration during output construction. By processing the input array from right to left, the last occurrence of each value gets placed last among its duplicates, preserving original order.
```python
from dataclasses import dataclass

@dataclass
class Student:
    name: str
    grade: int

def counting_sort_by_grade(students: list[Student]) -> list[Student]:
    """
    Sort students by grade, maintaining stability.
    Grades assumed to be 0-100.
    """
    if not students:
        return []

    max_grade = 100
    count = [0] * (max_grade + 1)

    # Count by grade
    for student in students:
        count[student.grade] += 1

    # Cumulative count
    for i in range(1, len(count)):
        count[i] += count[i - 1]

    # Build output - reverse iteration for stability
    output = [None] * len(students)
    for student in reversed(students):
        count[student.grade] -= 1
        output[count[student.grade]] = student

    return output

# Demonstration
students = [
    Student("Alice", 85),
    Student("Bob", 90),
    Student("Charlie", 85),  # Same grade as Alice
    Student("Diana", 90),    # Same grade as Bob
]

sorted_students = counting_sort_by_grade(students)
for s in sorted_students:
    print(f"{s.name}: {s.grade}")
# Output:
# Alice: 85    <- Alice before Charlie (stable)
# Charlie: 85
# Bob: 90      <- Bob before Diana (stable)
# Diana: 90
```
This stability property makes counting sort the backbone of radix sort, which sorts numbers digit by digit. Each digit-level sort must be stable, or the previously sorted digits get scrambled.
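To make that role concrete, here is a minimal LSD radix sort sketch (base 10, non-negative integers; the function name `radix_sort` is ours). Each pass is a stable counting sort keyed on one digit, so earlier passes survive later ones:

```python
def radix_sort(arr: list[int]) -> list[int]:
    """LSD radix sort: one stable counting pass per decimal digit."""
    if not arr:
        return []
    output = list(arr)
    max_val = max(arr)
    exp = 1  # 1 = ones digit, 10 = tens digit, ...
    while max_val // exp > 0:
        # Stable counting sort on the digit at position exp
        count = [0] * 10
        for num in output:
            count[(num // exp) % 10] += 1
        for i in range(1, 10):
            count[i] += count[i - 1]
        result = [0] * len(output)
        for num in reversed(output):  # reverse iteration keeps each pass stable
            d = (num // exp) % 10
            count[d] -= 1
            result[count[d]] = num
        output = result
        exp *= 10
    return output

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Because each pass only ever sees k = 10 possible digit values, the count array stays tiny no matter how large the numbers get.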
Complexity Analysis and Trade-offs
Time Complexity: O(n + k)
- Counting occurrences: O(n)
- Cumulative sum: O(k)
- Building output: O(n)
- Total: O(n + k)
Space Complexity: O(n + k)
- Count array: O(k)
- Output array: O(n)
- Total: O(n + k)
The critical factor is the relationship between n and k. Here’s when counting sort makes sense:
| Scenario | n | k | Verdict |
|---|---|---|---|
| Sorting grades (0-100) | 10,000 | 101 | Excellent |
| Sorting ages | 1,000,000 | 150 | Excellent |
| Sorting 32-bit integers | 1,000 | 4.3 billion | Terrible |
| Sorting HTTP codes | 50,000 | ~600 | Good |
When k ≫ n, counting sort degrades badly. You’re allocating and initializing a massive count array for sparse data. In these cases, comparison-based sorts are faster in practice despite their worse asymptotic bound.
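A back-of-envelope calculation makes the 32-bit row concrete (assuming roughly 8 bytes per count entry, as in a NumPy int64 array; a Python list of int objects would cost even more):

```python
def count_array_bytes(k: int, bytes_per_entry: int = 8) -> int:
    # Memory for the count array alone, before any sorting happens
    return k * bytes_per_entry

print(count_array_bytes(101))             # grades 0-100: 808 bytes
print(count_array_bytes(2**32) // 2**30)  # full 32-bit range: 32 GiB
```

Tens of gigabytes of mostly-zero counters to sort a thousand numbers: that is what “terrible” means in the table above.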
Quick comparison:
- Counting sort: Best when k ≤ n, stable, not in-place
- Quicksort: O(n log n) average, not stable, in-place
- Mergesort: O(n log n) guaranteed, stable, not in-place
Practical Applications and Variations
Counting sort shines in specific scenarios:
- Radix sort subroutine: Sorting by individual digits (0-9)
- Histogram-based operations: When you need counts anyway
- Bounded categorical data: Status codes, ratings, enum values
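As a sketch of the histogram case (the helper name and the 0-5 rating scale are illustrative, not from any particular API): once you have tallied ratings for a dashboard, the sorted list falls out of the same array for free, because the values are their own keys.

```python
def histogram_and_sort(ratings: list[int], max_rating: int = 5) -> tuple[list[int], list[int]]:
    """Build a histogram of ratings in 0..max_rating and a sorted copy from it."""
    hist = [0] * (max_rating + 1)
    for r in ratings:
        hist[r] += 1
    # The histogram IS phase 1 of counting sort; emit values in order to finish
    sorted_ratings = [v for v in range(max_rating + 1) for _ in range(hist[v])]
    return hist, sorted_ratings

hist, ranked = histogram_and_sort([5, 3, 5, 1, 4, 5])
print(hist)    # [0, 1, 0, 1, 1, 3]
print(ranked)  # [1, 3, 4, 5, 5, 5]
```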
For real-world use, you’ll often need to handle negative integers:
```python
def counting_sort_with_negatives(arr: list[int]) -> list[int]:
    """
    Counting sort supporting negative integers.
    Shifts values to non-negative range internally.
    """
    if not arr:
        return []

    min_val = min(arr)
    max_val = max(arr)
    range_size = max_val - min_val + 1

    # Shift values to start at 0
    count = [0] * range_size
    for num in arr:
        count[num - min_val] += 1

    # Cumulative count
    for i in range(1, range_size):
        count[i] += count[i - 1]

    # Build output with reverse shift
    output = [0] * len(arr)
    for num in reversed(arr):
        shifted = num - min_val
        count[shifted] -= 1
        output[count[shifted]] = num

    return output

# Test with negative numbers
data = [3, -1, 0, -5, 2, -1, 4, 0]
print(counting_sort_with_negatives(data))
# Output: [-5, -1, -1, 0, 0, 2, 3, 4]
```
The trick is simple: find the minimum value and shift all values by subtracting it. This maps your range [min, max] to [0, max - min]. The output values are the originals—we only shift for indexing.
Choosing the Right Tool
Counting sort is a specialized algorithm, not a general-purpose replacement for quicksort. Use it when:
- You’re sorting integers (or data with integer keys)
- The range of values is known and reasonable (k ≈ O(n) or smaller)
- You need stability (especially for radix sort)
- You can afford O(n + k) extra space
Skip it when:
- The value range is huge relative to input size
- You’re sorting floating-point numbers or strings
- Memory is extremely constrained
- You need an in-place algorithm
The algorithm’s simplicity is deceptive. Understanding why it works—the cumulative count transformation, the reverse iteration for stability—makes you a better engineer. These patterns appear throughout computer science, from histogram equalization in image processing to bucket-based algorithms in distributed systems.
When the constraints align, counting sort delivers unbeatable performance. Recognize those situations, and you’ll have a powerful tool that most developers overlook.