Quick Sort: Partition-Based Sorting Algorithm

Key Insights

  • Quick sort’s performance hinges entirely on pivot selection—a poor choice degrades O(n log n) to O(n²), making randomized or median-of-three strategies essential for production code.
  • The partition operation is the algorithm’s workhorse; understanding Lomuto vs. Hoare schemes helps you choose the right variant for your data characteristics.
  • Modern implementations hybridize quick sort with insertion sort for small sub-arrays and use three-way partitioning for duplicate-heavy data, making the textbook version rarely optimal.

Introduction to Quick Sort

Quick sort stands as one of the most widely used sorting algorithms in practice, and for good reason. Despite sharing the same O(n log n) average time complexity as merge sort, quick sort typically outperforms it due to better cache locality and lower constant factors. The algorithm sorts in-place, requiring only O(log n) auxiliary space for the recursion stack, compared to merge sort’s O(n) space requirement.

The algorithm follows a divide-and-conquer strategy, but with a twist. Where merge sort does the heavy lifting during the merge phase, quick sort front-loads its work in the partition phase. You pick a pivot element, rearrange the array so everything smaller sits to the left and everything larger sits to the right, then recursively apply the same logic to both halves.

This simplicity masks considerable depth. The devil lives in the details—particularly in how you choose your pivot and handle edge cases.

The Partition Operation

Partitioning is the core operation that makes quick sort work. Given an array and a pivot value, partitioning rearranges elements so that all values less than or equal to the pivot precede all values greater than the pivot. The pivot ends up in its final sorted position.

The Lomuto partition scheme, while not the most efficient, offers clarity that makes it excellent for learning:

def lomuto_partition(arr: list, low: int, high: int) -> int:
    """
    Partition array using Lomuto scheme with last element as pivot.
    Returns the final position of the pivot.
    """
    pivot = arr[high]
    i = low - 1  # Index of smaller element boundary
    
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    
    # Place pivot in its correct position
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

The variable i marks the right edge of the region holding elements less than or equal to the pivot. As j scans the array, whenever arr[j] is less than or equal to the pivot, we grow that region by incrementing i and swapping. After the loop, we swap the pivot into position i + 1, which becomes its final sorted location.

This scheme performs roughly n comparisons and up to n swaps per partition. We’ll see later how Hoare’s original scheme reduces swaps significantly.

The Recursive Quick Sort Algorithm

With partitioning understood, the full algorithm becomes straightforward:

def quick_sort(arr: list, low: int = 0, high: int = None) -> None:
    """
    Sort array in-place using quick sort.
    """
    if high is None:
        high = len(arr) - 1
    
    # Base case: sub-array has 0 or 1 elements
    if low >= high:
        return
    
    # Partition and get pivot's final position
    pivot_index = lomuto_partition(arr, low, high)
    
    # Recursively sort elements before and after partition
    quick_sort(arr, low, pivot_index - 1)
    quick_sort(arr, pivot_index + 1, high)


# Usage
data = [64, 34, 25, 12, 22, 11, 90]
quick_sort(data)
print(data)  # [11, 12, 22, 25, 34, 64, 90]

The recursion terminates when a sub-array contains zero or one elements—both are trivially sorted. Each recursive call handles a strictly smaller portion of the array, guaranteeing termination.

Notice that unlike merge sort, we don’t need to merge results. Once both recursive calls complete, the array is sorted because each partition operation places the pivot in its final position, and recursion ensures all other elements reach their correct positions too.

Pivot Selection Strategies

Pivot selection dramatically impacts performance. The ideal pivot divides the array into two equal halves, yielding O(n log n) performance. The worst pivot creates a maximally unbalanced partition—one side with n-1 elements and the other empty—degrading to O(n²).

First or last element: Simple but dangerous. Already-sorted or reverse-sorted input triggers worst-case behavior, which happens more often than you’d expect in real data.

Random pivot: Statistically eliminates worst-case patterns. An adversary cannot construct input that reliably triggers poor performance.
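A minimal sketch of the randomized strategy: pick a random index, swap it into the pivot slot, then reuse the Lomuto partition (reproduced here so the snippet stands alone):

```python
import random

def lomuto_partition(arr: list, low: int, high: int) -> int:
    """Lomuto partition with last element as pivot (as defined earlier)."""
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def randomized_partition(arr: list, low: int, high: int) -> int:
    """Swap a uniformly random element into the pivot slot, then partition."""
    rand_idx = random.randint(low, high)
    arr[rand_idx], arr[high] = arr[high], arr[rand_idx]
    return lomuto_partition(arr, low, high)

def quick_sort_random(arr: list, low: int = 0, high: int = None) -> None:
    """Quick sort with a randomized pivot; robust against sorted input."""
    if high is None:
        high = len(arr) - 1
    if low >= high:
        return
    p = randomized_partition(arr, low, high)
    quick_sort_random(arr, low, p - 1)
    quick_sort_random(arr, p + 1, high)
```

With a random pivot, already-sorted input is no longer a special worst case: the expected running time is O(n log n) regardless of input order.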

Median-of-three: Select the median of the first, middle, and last elements. This handles sorted input well and approximates the true median cheaply:

def median_of_three(arr: list, low: int, high: int) -> int:
    """
    Return index of median among first, middle, and last elements.
    """
    mid = (low + high) // 2
    
    # Sort the three candidates
    if arr[low] > arr[mid]:
        arr[low], arr[mid] = arr[mid], arr[low]
    if arr[low] > arr[high]:
        arr[low], arr[high] = arr[high], arr[low]
    if arr[mid] > arr[high]:
        arr[mid], arr[high] = arr[high], arr[mid]
    
    # Move median to high-1 position for Lomuto partition
    arr[mid], arr[high - 1] = arr[high - 1], arr[mid]
    return high - 1


def quick_sort_median(arr: list, low: int = 0, high: int = None) -> None:
    """
    Quick sort with median-of-three pivot selection.
    """
    if high is None:
        high = len(arr) - 1
    
    if high - low < 2:  # Handle small sub-arrays directly
        if high > low and arr[low] > arr[high]:
            arr[low], arr[high] = arr[high], arr[low]
        return
    
    pivot_idx = median_of_three(arr, low, high)
    # Partition arr[low..high-1]; arr[high] can safely be skipped because
    # median_of_three left the largest of the three samples there
    pivot_idx = lomuto_partition(arr, low, pivot_idx)
    
    quick_sort_median(arr, low, pivot_idx - 1)
    quick_sort_median(arr, pivot_idx + 1, high)

For production code, median-of-three or randomized selection should be your default. The performance cost is negligible, and the protection against adversarial or pathological input is significant.

Performance Analysis

Best case O(n log n): Occurs when each partition divides the array into two roughly equal halves. We perform n comparisons per level, and with balanced partitions, we have log n levels.

Average case O(n log n): Even with imperfect partitions, as long as we achieve some constant fraction split (say 90/10), we maintain O(n log n). The constant factor increases, but the complexity class holds.

Worst case O(n²): Happens when partitions are maximally unbalanced—pivot is always the smallest or largest element. We perform n-1 partitions, each scanning nearly the entire remaining array.
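The degradation is easy to observe empirically. The instrumented sort below (a hypothetical helper, using the same last-element Lomuto scheme as earlier) counts comparisons: sorted input of n elements costs exactly n(n-1)/2, while shuffled input stays near n log n.

```python
def count_comparisons(arr: list) -> int:
    """Quick sort with last-element pivot, returning total comparisons made."""
    comparisons = 0

    def partition(low: int, high: int) -> int:
        nonlocal comparisons
        pivot = arr[high]
        i = low - 1
        for j in range(low, high):
            comparisons += 1  # one comparison per scanned element
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        return i + 1

    def sort(low: int, high: int) -> None:
        if low >= high:
            return
        p = partition(low, high)
        sort(low, p - 1)
        sort(p + 1, high)

    sort(0, len(arr) - 1)
    return comparisons
```

For n = 100, already-sorted input performs exactly 100 · 99 / 2 = 4,950 comparisons, while a shuffled copy typically needs only a few hundred.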

Space complexity deserves attention. Quick sort is often called “in-place,” but the recursion stack consumes O(log n) space in the average case. In the worst case, the stack depth reaches O(n), which can cause stack overflow on large, pathological inputs. This is why production implementations recurse into the smaller partition and handle the larger one iteratively (or with an explicit stack), capping the stack depth at O(log n).

def quick_sort_iterative(arr: list) -> None:
    """
    Quick sort with explicit stack to avoid recursion depth issues.
    """
    stack = [(0, len(arr) - 1)]
    
    while stack:
        low, high = stack.pop()
        
        if low >= high:
            continue
        
        pivot_idx = lomuto_partition(arr, low, high)
        
        # Push larger partition first (processed last)
        # This ensures stack depth stays O(log n)
        if pivot_idx - low < high - pivot_idx:
            stack.append((pivot_idx + 1, high))
            stack.append((low, pivot_idx - 1))
        else:
            stack.append((low, pivot_idx - 1))
            stack.append((pivot_idx + 1, high))

Optimizations and Variants

Real-world quick sort implementations incorporate several optimizations.

Insertion sort cutoff: For small sub-arrays (typically 10-20 elements), insertion sort outperforms quick sort due to lower overhead. The recursion and partitioning machinery costs more than insertion sort’s O(n²) when n is tiny.
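A sketch of the hybrid (the cutoff of 16 is an illustrative choice, not a universal constant; the last-element Lomuto scheme is inlined so the snippet stands alone):

```python
CUTOFF = 16  # assumed threshold; real libraries tune this empirically

def insertion_sort(arr: list, low: int, high: int) -> None:
    """Sort arr[low..high] in place; very fast for tiny ranges."""
    for i in range(low + 1, high + 1):
        key = arr[i]
        j = i - 1
        while j >= low and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

def hybrid_quick_sort(arr: list, low: int = 0, high: int = None) -> None:
    """Quick sort that falls back to insertion sort below CUTOFF elements."""
    if high is None:
        high = len(arr) - 1
    if high - low + 1 <= CUTOFF:
        insertion_sort(arr, low, high)
        return
    # Lomuto partition with last element as pivot (as defined earlier)
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    p = i + 1
    hybrid_quick_sort(arr, low, p - 1)
    hybrid_quick_sort(arr, p + 1, high)
```

The right cutoff is found by benchmarking; published choices tend to land roughly between 10 and 32 elements.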

Hoare partition scheme: Tony Hoare’s original partition uses two pointers moving toward each other, typically performing fewer swaps than Lomuto’s scheme. It’s trickier to implement correctly but measurably faster.
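A sketch of Hoare's scheme (note that it returns a split point rather than the pivot's final index, so the recursion bounds differ from the Lomuto-based version shown earlier):

```python
def hoare_partition(arr: list, low: int, high: int) -> int:
    """
    Hoare partition with first element as pivot.
    Returns j such that every element of arr[low..j] is <= every
    element of arr[j+1..high]; unlike Lomuto, the pivot itself
    need not land at index j.
    """
    pivot = arr[low]
    i, j = low - 1, high + 1
    while True:
        i += 1
        while arr[i] < pivot:  # scan right for an element >= pivot
            i += 1
        j -= 1
        while arr[j] > pivot:  # scan left for an element <= pivot
            j -= 1
        if i >= j:
            return j
        arr[i], arr[j] = arr[j], arr[i]  # swap the out-of-place pair

def quick_sort_hoare(arr: list, low: int = 0, high: int = None) -> None:
    """Quick sort using Hoare partitioning."""
    if high is None:
        high = len(arr) - 1
    if low >= high:
        return
    j = hoare_partition(arr, low, high)
    quick_sort_hoare(arr, low, j)  # pivot may still be inside this half
    quick_sort_hoare(arr, j + 1, high)
```

Elements equal to the pivot can end up on either side, which spreads duplicates across both partitions and avoids some of Lomuto's pathological behavior on duplicate-heavy input.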

Three-way partitioning: When arrays contain many duplicate keys, standard quick sort wastes time re-partitioning equal elements. Three-way partitioning groups elements into less-than, equal-to, and greater-than regions:

def three_way_partition(arr: list, low: int, high: int) -> tuple:
    """
    Dutch National Flag partition for handling duplicates.
    Returns (lt, gt) where:
    - arr[low:lt] contains elements < pivot
    - arr[lt:gt+1] contains elements == pivot
    - arr[gt+1:high+1] contains elements > pivot
    """
    pivot = arr[low]
    lt = low      # arr[low:lt] < pivot
    i = low + 1   # arr[lt:i] == pivot
    gt = high     # arr[gt+1:high+1] > pivot
    
    while i <= gt:
        if arr[i] < pivot:
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1
            i += 1
        elif arr[i] > pivot:
            arr[i], arr[gt] = arr[gt], arr[i]
            gt -= 1
        else:
            i += 1
    
    return lt, gt


def quick_sort_three_way(arr: list, low: int = 0, high: int = None) -> None:
    """
    Quick sort optimized for arrays with many duplicate keys.
    """
    if high is None:
        high = len(arr) - 1
    
    if low >= high:
        return
    
    lt, gt = three_way_partition(arr, low, high)
    
    quick_sort_three_way(arr, low, lt - 1)
    quick_sort_three_way(arr, gt + 1, high)

This optimization transforms quick sort’s worst case on all-equal arrays from O(n²) to O(n).
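That claim is easy to check by counting partition calls (a hypothetical counting wrapper around the same Dutch National Flag scheme): on an all-equal array, a single partition consumes the entire input.

```python
def quick_sort_three_way_counted(arr: list) -> int:
    """Three-way quick sort that returns the number of partition calls."""
    calls = 0

    def partition(low: int, high: int) -> tuple:
        nonlocal calls
        calls += 1
        pivot = arr[low]
        lt, i, gt = low, low + 1, high
        while i <= gt:
            if arr[i] < pivot:
                arr[lt], arr[i] = arr[i], arr[lt]
                lt += 1
                i += 1
            elif arr[i] > pivot:
                arr[i], arr[gt] = arr[gt], arr[i]
                gt -= 1
            else:
                i += 1
        return lt, gt

    def sort(low: int, high: int) -> None:
        if low >= high:
            return
        lt, gt = partition(low, high)
        sort(low, lt - 1)
        sort(gt + 1, high)

    sort(0, len(arr) - 1)
    return calls
```

On an all-equal array like [7] * 1000, the wrapper reports exactly one partition call, since the equal-to region swallows every element at once.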

When to Use Quick Sort

Choose quick sort when you need fast average-case performance with minimal memory overhead. It excels when:

  • Memory is constrained (in-place sorting matters)
  • Cache performance is critical (sequential access patterns)
  • You can use randomized or median-of-three pivot selection
  • Data isn’t already nearly sorted

Choose merge sort instead when:

  • You need guaranteed O(n log n) worst-case performance
  • You’re sorting linked lists (quick sort’s random access hurts here)
  • Stability matters (equal elements must maintain relative order)

Most standard library sort implementations use hybrid approaches. Python’s Timsort combines merge sort with insertion sort. C++’s std::sort typically uses Introsort—quick sort that switches to heap sort if recursion depth exceeds a threshold, guaranteeing O(n log n) worst case while preserving quick sort’s average-case speed.
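A toy sketch of the introsort idea (the depth limit of 2 * log2(n) mirrors common implementations but is an assumption here, and the heapq fallback copies the slice, whereas real introsort heap-sorts in place):

```python
import heapq

def introsort(arr: list) -> None:
    """Quick sort with a depth limit; past the limit, fall back to heap
    sort so the worst case stays O(n log n)."""
    max_depth = 2 * max(1, len(arr)).bit_length()  # ~2 * log2(n)

    def heap_sort(low: int, high: int) -> None:
        # Heap-sort the sub-range via heapq (copies the slice for simplicity)
        sub = arr[low:high + 1]
        heapq.heapify(sub)
        arr[low:high + 1] = [heapq.heappop(sub) for _ in range(len(sub))]

    def sort(low: int, high: int, depth: int) -> None:
        if low >= high:
            return
        if depth == 0:
            heap_sort(low, high)  # depth budget exhausted: switch strategies
            return
        # Lomuto partition with last element as pivot (as defined earlier)
        pivot = arr[high]
        i = low - 1
        for j in range(low, high):
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        p = i + 1
        sort(low, p - 1, depth - 1)
        sort(p + 1, high, depth - 1)

    sort(0, len(arr) - 1, max_depth)
```

Even fully reversed input cannot push the recursion past the depth limit; once the limit is hit, the heap sort fallback bounds the worst case at O(n log n).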

Quick sort’s elegance lies in its simplicity and adaptability. Master the basic algorithm, understand the pitfalls, apply the appropriate optimizations, and you’ll have a sorting tool that handles the vast majority of real-world scenarios efficiently.
