Boyer-Moore Voting Algorithm: Majority Element

Key Insights

  • Boyer-Moore Voting Algorithm finds the majority element in O(n) time and O(1) space by treating the problem as a “vote cancellation” process rather than counting
  • The algorithm works in two phases: candidate selection (which may produce a false positive) and verification (which confirms the result when majority isn’t guaranteed)
  • This pattern extends to finding elements appearing more than n/3 times by tracking two candidates simultaneously, making it valuable for distributed systems and stream processing

Introduction & Problem Definition

The majority element problem asks a deceptively simple question: given an array of n elements, find the element that appears more than n/2 times. If such an element exists, it dominates the array—it appears more frequently than all other elements combined.

This problem surfaces constantly in real systems. Election tabulation systems need to determine winners efficiently. Distributed consensus protocols like Raft identify which nodes agree on a value. Stream processing pipelines detect dominant patterns in real-time data. Network routers identify heavy hitters in traffic flows.

The naive instinct is to count everything. But when you’re processing billions of votes, analyzing terabytes of log data, or running on memory-constrained embedded devices, that O(n) space overhead becomes a real problem. We need something cleverer.

Naive Approaches & Their Limitations

Before appreciating Boyer-Moore, let’s establish baselines.

The hash map approach is what most developers reach for first:

from collections import Counter

def majority_element_hashmap(nums: list[int]) -> int | None:
    """Find majority element using hash map counting.

    Time: O(n), Space: O(n)
    """
    counts = Counter(nums)
    threshold = len(nums) // 2

    for num, count in counts.items():
        if count > threshold:
            return num

    return None  # No majority element (safer than a sentinel like -1, which could be a valid element)

This works. It’s readable. It’s also O(n) space. For an array of 10 billion integers, you might need gigabytes of memory just for the hash map. Worse, in streaming scenarios where you can’t store the entire dataset, this approach fails entirely.

The sorting approach trades time for space:

def majority_element_sorting(nums: list[int]) -> int:
    """Find majority element using sorting.

    Time: O(n log n)
    Space: O(n) for Python's Timsort (an in-place heapsort would give O(1))
    """
    nums.sort()
    return nums[len(nums) // 2]  # Majority must occupy the middle index

If a majority element exists, it must occupy the middle position after sorting. Clever, but O(n log n) time hurts at scale, `nums.sort()` mutates the caller's list, and the function silently returns a wrong value when no majority exists.

We want O(n) time AND O(1) space. Boyer-Moore delivers exactly that.

Boyer-Moore Voting Algorithm Explained

The algorithm’s brilliance lies in reframing the problem. Instead of counting occurrences, think of it as an election where elements “vote” and opposing votes cancel out.

Here’s the core intuition: if an element appears more than half the time, it will survive any cancellation process. Pair up each majority element with a different element, and you’ll run out of “opponents” before running out of majority elements.

The algorithm maintains just two variables: a candidate and a count. Walk through the array once:

  1. If count is zero, adopt the current element as the new candidate
  2. If the current element matches the candidate, increment count
  3. If it differs, decrement count (a “cancellation”)

After one pass, the candidate is your answer—assuming a majority exists.

def majority_element_boyer_moore(nums: list[int]) -> int:
    """Find majority element using Boyer-Moore Voting Algorithm.
    
    Time: O(n), Space: O(1)
    
    The algorithm treats the problem as vote cancellation:
    - Each element "votes" for itself
    - Different elements cancel each other out
    - The majority element survives all cancellations
    """
    candidate = None
    count = 0
    
    # Phase 1: Candidate selection
    for num in nums:
        if count == 0:
            # No current candidate; adopt this element
            candidate = num
            count = 1
        elif num == candidate:
            # Same as candidate; strengthen position
            count += 1
        else:
            # Different from candidate; cancel one vote
            count -= 1
    
    return candidate

When the problem guarantees a majority exists (like LeetCode 169), this single pass suffices. But in real systems, you often can’t make that assumption.

Visual Walkthrough

Let’s trace through the array [2, 2, 1, 1, 1, 2, 2] step by step:

Index | Element | Action                  | Candidate | Count
------|---------|-------------------------|-----------|------
  0   |    2    | count=0, adopt 2        |     2     |   1
  1   |    2    | matches, increment      |     2     |   2
  2   |    1    | differs, decrement      |     2     |   1
  3   |    1    | differs, decrement      |     2     |   0
  4   |    1    | count=0, adopt 1        |     1     |   1
  5   |    2    | differs, decrement      |     1     |   0
  6   |    2    | count=0, adopt 2        |     2     |   1
------|---------|-------------------------|-----------|------
                  Final candidate: 2

Notice how the candidate changes multiple times. At index 4, we temporarily adopt 1 as the candidate. This might seem wrong—1 isn’t the majority element. But that’s fine. The “wrong” candidate gets eliminated when subsequent elements differ from it.

The key insight: the true majority element (2, appearing 4 times out of 7) has enough “votes” to survive all cancellations. Even when we temporarily lose track of it, it resurfaces.
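The trace above can be reproduced with a few lines of instrumentation, a standalone restatement of the same single pass:

```python
nums = [2, 2, 1, 1, 1, 2, 2]
candidate, count = None, 0

for i, num in enumerate(nums):
    if count == 0:
        candidate, count = num, 1   # adopt a new candidate
    elif num == candidate:
        count += 1                  # matching vote strengthens it
    else:
        count -= 1                  # conflicting vote cancels one
    print(f"index={i} element={num} candidate={candidate} count={count}")

# The final candidate is 2, matching the table above.
```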

Complexity Analysis & Proof of Correctness

Time complexity: O(n). Single pass through the array, constant work per element.

Space complexity: O(1). Just two variables regardless of input size.

Why verification matters: The algorithm above assumes a majority exists. Without that guarantee, the candidate might be a false positive. Consider [1, 2, 3]—the algorithm returns 3, but no majority exists.

def majority_element_verified(nums: list[int]) -> int | None:
    """Boyer-Moore with verification phase.
    
    Use when majority element existence isn't guaranteed.
    """
    # Phase 1: Find candidate
    candidate = None
    count = 0
    
    for num in nums:
        if count == 0:
            candidate = num
            count = 1
        elif num == candidate:
            count += 1
        else:
            count -= 1
    
    # Phase 2: Verify candidate
    actual_count = sum(1 for num in nums if num == candidate)
    
    if actual_count > len(nums) // 2:
        return candidate
    return None

Correctness argument: Suppose element m appears more than n/2 times. Every decrement pairs one occurrence of the current candidate with one conflicting element, discarding both from consideration:

  • If the candidate is m, the cancellation consumes one occurrence of m and one non-m element
  • If the candidate is not m, the cancellation consumes two elements, at most one of which is m

Either way, each cancellation discards at most one occurrence of m along with at least one other element. Since m appears more than n/2 times, it outnumbers all other elements combined, so the other elements are exhausted first and m must end the pass as the surviving candidate.
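A randomized spot check (not a proof) can build confidence in the candidate-selection phase. This sketch constructs arrays with a guaranteed majority and confirms the single pass always finds it; the seed and value choices are arbitrary:

```python
import random

def boyer_moore_candidate(nums):
    """Single-pass candidate selection (no verification)."""
    candidate, count = None, 0
    for num in nums:
        if count == 0:
            candidate, count = num, 1
        elif num == candidate:
            count += 1
        else:
            count -= 1
    return candidate

random.seed(42)
for _ in range(1000):
    n = random.randrange(1, 50)
    k = n // 2 + 1                      # strictly more than n/2 copies
    arr = [7] * k + [random.randrange(100) for _ in range(n - k)]
    random.shuffle(arr)
    assert boyer_moore_candidate(arr) == 7
```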

Variations & Extensions

The algorithm extends elegantly to finding elements appearing more than n/3 times. Since at most two elements can exceed this threshold, we track two candidates:

def majority_elements_n3(nums: list[int]) -> list[int]:
    """Find all elements appearing more than n/3 times.
    
    Uses extended Boyer-Moore with two candidates.
    Time: O(n), Space: O(1)
    
    At most 2 elements can appear > n/3 times in any array.
    """
    candidate1, candidate2 = None, None
    count1, count2 = 0, 0
    
    # Phase 1: Find up to two candidates
    for num in nums:
        if candidate1 == num:
            count1 += 1
        elif candidate2 == num:
            count2 += 1
        elif count1 == 0:
            candidate1, count1 = num, 1
        elif count2 == 0:
            candidate2, count2 = num, 1
        else:
            # Cancel one vote from each candidate
            count1 -= 1
            count2 -= 1
    
    # Phase 2: Verify candidates
    threshold = len(nums) // 3
    result = []
    
    for candidate in [candidate1, candidate2]:
        if candidate is not None:
            if sum(1 for num in nums if num == candidate) > threshold:
                result.append(candidate)
    
    return result

The order of conditions matters here. We must check whether the current element matches either candidate before checking whether any count is zero. Otherwise, when count1 is zero but the element matches candidate2, we would overwrite candidate1 with candidate2's value and end up tracking the same element in both slots.
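A quick trace of the two-candidate pass on a concrete input, using a condensed restatement of the selection phase above (verification omitted for brevity):

```python
def two_candidates(nums):
    """Condensed two-candidate selection pass (no verification)."""
    c1 = c2 = None
    n1 = n2 = 0
    for x in nums:
        if c1 == x:          # match checks MUST come before zero checks
            n1 += 1
        elif c2 == x:
            n2 += 1
        elif n1 == 0:
            c1, n1 = x, 1
        elif n2 == 0:
            c2, n2 = x, 1
        else:
            n1 -= 1          # cancel one vote from each candidate
            n2 -= 1
    return c1, c2

# 1 and 3 each appear 3 times out of 8 (> 8/3), and both survive.
print(two_candidates([1, 1, 1, 3, 3, 2, 2, 3]))
```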

Stream processing applications: Boyer-Moore shines in streaming contexts. You can process elements one at a time, maintaining constant memory. This makes it ideal for:

  • Real-time log analysis identifying dominant error codes
  • Network monitoring detecting traffic anomalies
  • Sensor data processing finding predominant readings
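A minimal streaming wrapper makes the constant-memory claim concrete; the class and method names here are illustrative, not from any particular library:

```python
class StreamingMajority:
    """Boyer-Moore candidate tracking over a stream with O(1) state.

    Note: a single pass over a stream cannot verify the candidate;
    a second pass (or stored sample) is needed when a majority is
    not guaranteed.
    """

    def __init__(self):
        self.candidate = None
        self.count = 0

    def observe(self, item):
        """Feed one element from the stream."""
        if self.count == 0:
            self.candidate, self.count = item, 1
        elif item == self.candidate:
            self.count += 1
        else:
            self.count -= 1

# Example: tracking the dominant HTTP status code in a log stream
tracker = StreamingMajority()
for code in ["500", "404", "500", "500", "200", "500"]:
    tracker.observe(code)
# tracker.candidate is now the only value that *could* be a majority
```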

For heavy-hitter detection at arbitrary thresholds in streams, a Count-Min Sketch complements this approach: it trades the exact n/k guarantee for probabilistic frequency estimates in sublinear space.

Practice Problems & Conclusion

Test your understanding with these problems:

  • LeetCode 169 - Majority Element: Direct application, majority guaranteed to exist
  • LeetCode 229 - Majority Element II: The n/3 variant requiring two candidates
  • LeetCode 1150 - Check If a Number Is Majority Element in a Sorted Array: Combine with binary search

When to use Boyer-Moore:

  • Memory constraints prevent hash map usage
  • Processing streams where you can’t store all data
  • The threshold is n/k for small k (track k-1 candidates)
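The n/k generalization mentioned above is known as the Misra-Gries algorithm. A compact sketch, with the verification pass included since candidates are not guaranteed:

```python
def frequent_elements(nums, k):
    """Find all elements appearing more than len(nums)/k times.

    Misra-Gries summary: track at most k-1 live candidates.
    Time: O(n * k), Space: O(k).
    """
    counts = {}  # at most k-1 candidates at any moment
    for num in nums:
        if num in counts:
            counts[num] += 1
        elif len(counts) < k - 1:
            counts[num] = 1
        else:
            # Cancel one vote from every candidate (and implicitly from num)
            for key in list(counts):
                counts[key] -= 1
                if counts[key] == 0:
                    del counts[key]

    # Verification pass: survivors are only *possible* heavy hitters
    threshold = len(nums) / k
    return [c for c in counts
            if sum(1 for num in nums if num == c) > threshold]
```

With k = 2 this reduces to the single-candidate algorithm; with k = 3 it matches the two-candidate variant above.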

When to use alternatives:

  • You need exact counts of all elements (hash map)
  • The threshold is arbitrary (Count-Min Sketch)
  • Data is already sorted (binary search)

Boyer-Moore represents a broader pattern in algorithm design: sometimes the clever solution isn’t about building complex data structures—it’s about recognizing mathematical properties that let you discard information safely. The majority element survives cancellation not because we track it carefully, but because it’s mathematically impossible for it to be fully canceled.

This “cancellation” intuition appears elsewhere: in finding single numbers among pairs (XOR cancellation), in stream algorithms (symmetric differences), and in distributed computing (Byzantine fault tolerance thresholds). Master the pattern, and you’ll recognize opportunities to optimize space in unexpected places.
