Moore's Voting Algorithm: Majority Element
The majority element problem asks a deceptively simple question: given an array of n elements, find the element that appears more than n/2 times. If such an element exists, it dominates the array—it...
Key Insights
- Boyer-Moore Voting Algorithm finds the majority element in O(n) time and O(1) space by treating the problem as a “vote cancellation” process rather than counting
- The algorithm works in two phases: candidate selection (which may produce a false positive) and verification (which confirms the result when majority isn’t guaranteed)
- This pattern extends to finding elements appearing more than n/3 times by tracking two candidates simultaneously, making it valuable for distributed systems and stream processing
Introduction & Problem Definition
The majority element problem asks a deceptively simple question: given an array of n elements, find the element that appears more than n/2 times. If such an element exists, it dominates the array—it appears more frequently than all other elements combined.
This problem surfaces constantly in real systems. Election tabulation systems need to determine winners efficiently. Distributed consensus protocols like Raft identify which nodes agree on a value. Stream processing pipelines detect dominant patterns in real-time data. Network routers identify heavy hitters in traffic flows.
The naive instinct is to count everything. But when you’re processing billions of votes, analyzing terabytes of log data, or running on memory-constrained embedded devices, that O(n) space overhead becomes a real problem. We need something cleverer.
Naive Approaches & Their Limitations
Before appreciating Boyer-Moore, let’s establish baselines.
The hash map approach is what most developers reach for first:
from collections import Counter
def majority_element_hashmap(nums: list[int]) -> int:
"""Find majority element using hash map counting.
Time: O(n), Space: O(n)
"""
counts = Counter(nums)
threshold = len(nums) // 2
for num, count in counts.items():
if count > threshold:
return num
return -1 # No majority element
This works. It’s readable. It’s also O(n) space. For an array of 10 billion integers, you might need gigabytes of memory just for the hash map. Worse, in streaming scenarios where you can’t store the entire dataset, this approach fails entirely.
The sorting approach trades time for space:
def majority_element_sorting(nums: list[int]) -> int:
"""Find majority element using sorting.
Time: O(n log n), Space: O(1) or O(n) depending on sort
"""
nums.sort()
return nums[len(nums) // 2] # Majority must be at middle
If a majority element exists, it must occupy the middle position after sorting. Clever, but O(n log n) time complexity hurts at scale, and in-place sorting destroys the original data.
We want O(n) time AND O(1) space. Boyer-Moore delivers exactly that.
Boyer-Moore Voting Algorithm Explained
The algorithm’s brilliance lies in reframing the problem. Instead of counting occurrences, think of it as an election where elements “vote” and opposing votes cancel out.
Here’s the core intuition: if an element appears more than half the time, it will survive any cancellation process. Pair up each majority element with a different element, and you’ll run out of “opponents” before running out of majority elements.
The algorithm maintains just two variables: a candidate and a count. Walk through the array once:
- If count is zero, adopt the current element as the new candidate
- If the current element matches the candidate, increment count
- If it differs, decrement count (a “cancellation”)
After one pass, the candidate is your answer—assuming a majority exists.
def majority_element_boyer_moore(nums: list[int]) -> int:
"""Find majority element using Boyer-Moore Voting Algorithm.
Time: O(n), Space: O(1)
The algorithm treats the problem as vote cancellation:
- Each element "votes" for itself
- Different elements cancel each other out
- The majority element survives all cancellations
"""
candidate = None
count = 0
# Phase 1: Candidate selection
for num in nums:
if count == 0:
# No current candidate; adopt this element
candidate = num
count = 1
elif num == candidate:
# Same as candidate; strengthen position
count += 1
else:
# Different from candidate; cancel one vote
count -= 1
return candidate
When the problem guarantees a majority exists (like LeetCode 169), this single pass suffices. But in real systems, you often can’t make that assumption.
Visual Walkthrough
Let’s trace through the array [2, 2, 1, 1, 1, 2, 2] step by step:
Index | Element | Action | Candidate | Count
------|---------|-------------------------|-----------|------
0 | 2 | count=0, adopt 2 | 2 | 1
1 | 2 | matches, increment | 2 | 2
2 | 1 | differs, decrement | 2 | 1
3 | 1 | differs, decrement | 2 | 0
4 | 1 | count=0, adopt 1 | 1 | 1
5 | 2 | differs, decrement | 1 | 0
6 | 2 | count=0, adopt 2 | 2 | 1
------|---------|-------------------------|-----------|------
Final candidate: 2
Notice how the candidate changes multiple times. At index 4, we temporarily adopt 1 as the candidate. This might seem wrong—1 isn’t the majority element. But that’s fine. The “wrong” candidate gets eliminated when subsequent elements differ from it.
The key insight: the true majority element (2, appearing 4 times out of 7) has enough “votes” to survive all cancellations. Even when we temporarily lose track of it, it resurfaces.
Complexity Analysis & Proof of Correctness
Time complexity: O(n). Single pass through the array, constant work per element.
Space complexity: O(1). Just two variables regardless of input size.
Why verification matters: The algorithm above assumes a majority exists. Without that guarantee, the candidate might be a false positive. Consider [1, 2, 3]—the algorithm returns 3, but no majority exists.
def majority_element_verified(nums: list[int]) -> int | None:
"""Boyer-Moore with verification phase.
Use when majority element existence isn't guaranteed.
"""
# Phase 1: Find candidate
candidate = None
count = 0
for num in nums:
if count == 0:
candidate = num
count = 1
elif num == candidate:
count += 1
else:
count -= 1
# Phase 2: Verify candidate
actual_count = sum(1 for num in nums if num == candidate)
if actual_count > len(nums) // 2:
return candidate
return None
Correctness argument: Suppose element m appears more than n/2 times. Consider what happens during the algorithm:
- Every time we see
mand the candidate ism, count increases - Every time we see
mand the candidate differs, we “waste” one occurrence ofmto cancel - Every time we see a non-
melement, it either increases count for a wrong candidate (which will be canceled later) or decreases count
Since m appears more than n/2 times, there are more occurrences of m than all other elements combined. After all cancellations, m must be the surviving candidate.
Variations & Extensions
The algorithm extends elegantly to finding elements appearing more than n/3 times. Since at most two elements can exceed this threshold, we track two candidates:
def majority_elements_n3(nums: list[int]) -> list[int]:
"""Find all elements appearing more than n/3 times.
Uses extended Boyer-Moore with two candidates.
Time: O(n), Space: O(1)
At most 2 elements can appear > n/3 times in any array.
"""
candidate1, candidate2 = None, None
count1, count2 = 0, 0
# Phase 1: Find up to two candidates
for num in nums:
if candidate1 == num:
count1 += 1
elif candidate2 == num:
count2 += 1
elif count1 == 0:
candidate1, count1 = num, 1
elif count2 == 0:
candidate2, count2 = num, 1
else:
# Cancel one vote from each candidate
count1 -= 1
count2 -= 1
# Phase 2: Verify candidates
threshold = len(nums) // 3
result = []
for candidate in [candidate1, candidate2]:
if candidate is not None:
if sum(1 for num in nums if num == candidate) > threshold:
result.append(candidate)
return result
The order of conditions matters here. We must check if the current element matches either candidate before checking if counts are zero. Otherwise, we might overwrite a valid candidate.
Stream processing applications: Boyer-Moore shines in streaming contexts. You can process elements one at a time, maintaining constant memory. This makes it ideal for:
- Real-time log analysis identifying dominant error codes
- Network monitoring detecting traffic anomalies
- Sensor data processing finding predominant readings
For approximate heavy-hitter detection in streams, combine Boyer-Moore with Count-Min Sketch for probabilistic guarantees on elements exceeding any threshold.
Practice Problems & Conclusion
Test your understanding with these problems:
- LeetCode 169 - Majority Element: Direct application, majority guaranteed to exist
- LeetCode 229 - Majority Element II: The n/3 variant requiring two candidates
- LeetCode 1150 - Check If a Number Is Majority Element in a Sorted Array: Combine with binary search
When to use Boyer-Moore:
- Memory constraints prevent hash map usage
- Processing streams where you can’t store all data
- The threshold is n/k for small k (track k-1 candidates)
When to use alternatives:
- You need exact counts of all elements (hash map)
- The threshold is arbitrary (Count-Min Sketch)
- Data is already sorted (binary search)
Boyer-Moore represents a broader pattern in algorithm design: sometimes the clever solution isn’t about building complex data structures—it’s about recognizing mathematical properties that let you discard information safely. The majority element survives cancellation not because we track it carefully, but because it’s mathematically impossible for it to be fully canceled.
This “cancellation” intuition appears elsewhere: in finding single numbers among pairs (XOR cancellation), in stream algorithms (symmetric differences), and in distributed computing (Byzantine fault tolerance thresholds). Master the pattern, and you’ll recognize opportunities to optimize space in unexpected places.