# Partition Problem: Equal Subset Sum
## Key Insights
- The partition problem transforms into a subset sum problem: find any subset that sums to exactly half the total, and the remaining elements automatically form the other half.
- A 1D dynamic programming solution achieves O(n × sum) time with O(sum) space, but you must iterate backward through the candidate sums to avoid corrupting state mid-computation.
- The algorithm is pseudo-polynomial—practical performance depends on the magnitude of the sum, not just the number of elements, making it unsuitable for arrays with very large values.
## Introduction to the Partition Problem
The partition problem asks a deceptively simple question: given a set of positive integers, can you split them into two subsets such that both subsets have equal sums? Despite its straightforward framing, this problem is NP-complete, meaning no known polynomial-time algorithm solves all cases.
You’ll encounter this problem pattern frequently in real systems. Load balancers distribute requests across servers to equalize work. Task schedulers assign jobs to machines to minimize makespan. Resource allocators split memory or compute budgets between processes. Even multiplayer game matchmaking uses partition-like algorithms to create balanced teams.
Understanding the partition problem gives you a template for tackling a broad class of optimization challenges where you need to divide resources fairly.
## Problem Analysis and Key Insight
Before writing any code, let’s establish the mathematical foundation that makes this problem tractable.
If we partition array nums into subsets A and B with equal sums, then:
- sum(A) = sum(B)
- sum(A) + sum(B) = total
- Therefore: sum(A) = total / 2
This immediately gives us two critical insights. First, if the total sum is odd, partition is impossible—you can’t split an odd number into two equal integers. Second, the problem reduces to finding any subset that sums to exactly total / 2. If such a subset exists, the remaining elements automatically form the other half.
This reduction transforms a two-subset problem into a single-target subset sum problem, cutting our search space significantly.
```python
def can_partition(nums: list[int]) -> bool:
    total = sum(nums)
    # Early termination: odd sums can't be partitioned equally
    if total % 2 != 0:
        return False
    target = total // 2
    # Additional optimization: if any element exceeds target, impossible
    if max(nums) > target:
        return False
    # If any element equals target exactly, we're done
    if max(nums) == target:
        return True
    # Proceed to subset sum for target (implemented in the DP sections below)
    return subset_sum_exists(nums, target)
```
These early termination checks are cheap and eliminate many impossible cases before we do expensive computation.
## Brute Force Approach
The naive approach explores every possible subset combination. For each element, we make a binary choice: include it in subset A or exclude it (implicitly placing it in subset B). This generates 2^n possible subsets.
```python
def can_partition_brute_force(nums: list[int]) -> bool:
    total = sum(nums)
    if total % 2 != 0:
        return False
    target = total // 2

    def backtrack(index: int, current_sum: int) -> bool:
        # Found a valid subset
        if current_sum == target:
            return True
        # Exceeded target or exhausted elements
        if current_sum > target or index >= len(nums):
            return False
        # Choice 1: include current element
        if backtrack(index + 1, current_sum + nums[index]):
            return True
        # Choice 2: exclude current element
        return backtrack(index + 1, current_sum)

    return backtrack(0, 0)
```
This solution is correct but impractical. With n=30 elements, we’d explore over a billion combinations. At n=50, we’re looking at quadrillions of operations. The exponential O(2^n) time complexity makes this approach useful only for tiny inputs or as a reference implementation for testing.
The recursive structure does reveal something valuable: overlapping subproblems. Multiple paths through the decision tree can reach the same (index, current_sum) state. This overlap signals that dynamic programming can help.
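One direct way to exploit that overlap is to memoize the brute-force recursion on its (index, current_sum) state. This is a sketch of that idea (the function names are illustrative, not from the original text):

```python
from functools import lru_cache


def can_partition_memo(nums: list[int]) -> bool:
    """Top-down variant of the backtracking search, with memoization
    on the (index, current_sum) state to collapse overlapping subproblems."""
    total = sum(nums)
    if total % 2 != 0:
        return False
    target = total // 2

    @lru_cache(maxsize=None)
    def reachable(index: int, current_sum: int) -> bool:
        if current_sum == target:
            return True
        if current_sum > target or index == len(nums):
            return False
        # Same two choices as the brute force: include or skip nums[index]
        return (reachable(index + 1, current_sum + nums[index])
                or reachable(index + 1, current_sum))

    return reachable(0, 0)
```

Each distinct state is now computed at most once, which already brings the running time down to O(n × target) — the same bound the tabulated solutions below achieve explicitly.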
## Dynamic Programming Solution (2D)
The 2D DP approach builds a table where dp[i][j] answers: “Can we achieve sum j using some subset of the first i elements?”
The recurrence relation is straightforward:
- dp[i][j] = dp[i-1][j] (don't include element i)
- dp[i][j] = dp[i-1][j - nums[i]] (include element i, if j >= nums[i])
We take the logical OR of both options.
```python
def can_partition_2d_dp(nums: list[int]) -> bool:
    total = sum(nums)
    if total % 2 != 0:
        return False
    target = total // 2
    n = len(nums)
    # dp[i][j] = can we make sum j using first i elements?
    dp = [[False] * (target + 1) for _ in range(n + 1)]
    # Base case: sum of 0 is always achievable (empty subset)
    for i in range(n + 1):
        dp[i][0] = True
    # Fill the table row by row
    for i in range(1, n + 1):
        current_num = nums[i - 1]  # 0-indexed array, 1-indexed dp
        for j in range(1, target + 1):
            # Option 1: don't include current number
            dp[i][j] = dp[i - 1][j]
            # Option 2: include current number (if it fits)
            if j >= current_num:
                dp[i][j] = dp[i][j] or dp[i - 1][j - current_num]
    return dp[n][target]
```
Let’s trace through a small example. For nums = [1, 5, 11, 5], total = 22, target = 11.
| i\j | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | T | F | F | F | F | F | F | F | F | F | F | F |
| 1 | T | T | F | F | F | F | F | F | F | F | F | F |
| 2 | T | T | F | F | F | T | T | F | F | F | F | F |
| 3 | T | T | F | F | F | T | T | F | F | F | F | T |
| 4 | T | T | F | F | F | T | T | F | F | F | T | T |
The final cell dp[4][11] = True confirms we can partition [1, 5, 11, 5] into two equal subsets (specifically, [1, 5, 5] and [11]).
Time complexity: O(n × target). Space complexity: O(n × target).
## Space-Optimized DP (1D Array)
Notice that each row only depends on the previous row. We can collapse the 2D table into a 1D array, but there’s a critical subtlety: we must iterate backward through the sums.
Why backward? When iterating forward, updating dp[j] would use the already-updated dp[j - num] from the current iteration. This incorrectly allows using the same element multiple times. Backward iteration ensures we only reference values from the “previous row.”
```python
def can_partition_optimized(nums: list[int]) -> bool:
    total = sum(nums)
    if total % 2 != 0:
        return False
    target = total // 2
    # Early termination check
    if max(nums) > target:
        return False
    # dp[j] = can we achieve sum j with elements seen so far?
    dp = [False] * (target + 1)
    dp[0] = True  # Empty subset sums to 0
    for num in nums:
        # CRITICAL: iterate backward to avoid using same element twice
        # If we went forward, dp[j-num] might already be updated this round
        for j in range(target, num - 1, -1):
            dp[j] = dp[j] or dp[j - num]
        # Early exit: if we can already make target, we're done
        if dp[target]:
            return True
    return dp[target]
```
The early exit optimization can significantly speed up cases where a valid partition exists early in the array traversal.
Time complexity: O(n × target). Space complexity: O(target).
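To make the backward-iteration requirement concrete, here is a deliberately buggy forward version contrasted with the correct one (helper names are illustrative):

```python
def subset_sum_forward_buggy(nums: list[int], target: int) -> bool:
    """WRONG for 0/1 subset sum: forward iteration reads cells already
    updated this round, so one element can be counted many times.
    (This loop direction is what you'd use for the *unbounded* variant.)"""
    dp = [False] * (target + 1)
    dp[0] = True
    for num in nums:
        for j in range(num, target + 1):  # forward: reads current-round cells
            dp[j] = dp[j] or dp[j - num]
    return dp[target]


def subset_sum_backward(nums: list[int], target: int) -> bool:
    """Correct 0/1 version: backward iteration only reads 'previous row'."""
    dp = [False] * (target + 1)
    dp[0] = True
    for num in nums:
        for j in range(target, num - 1, -1):
            dp[j] = dp[j] or dp[j - num]
    return dp[target]


# With nums = [3] and target = 6, the forward version claims True
# (it reuses 3 twice: dp[3] is set, then dp[6] reads the fresh dp[3]),
# while the backward version correctly reports False.
```

The only difference between the two functions is the direction of the inner loop; that single change is what separates 0/1 subset sum from its unbounded cousin.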
## Edge Cases and Variations
Robust implementations handle these edge cases:
```python
def can_partition_robust(nums: list[int]) -> bool:
    # Empty array: technically can partition into two empty sets
    if not nums:
        return True
    # Single element: can't split one item
    if len(nums) == 1:
        return False
    # All zeros: always partitionable
    if all(x == 0 for x in nums):
        return True
    # Standard algorithm for normal cases
    total = sum(nums)
    if total % 2 != 0:
        return False
    target = total // 2
    dp = [False] * (target + 1)
    dp[0] = True
    for num in nums:
        for j in range(target, num - 1, -1):
            dp[j] = dp[j] or dp[j - num]
    return dp[target]
```
Negative numbers fundamentally change the problem. Our DP approach assumes non-negative values; negative numbers require offset adjustments or different algorithms entirely. Most interview and practical scenarios assume positive integers.
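If you do need to tolerate negatives, one simple alternative is to track the set of reachable subset sums directly instead of indexing an array by sum. This is a sketch, not part of the main algorithm; its memory is bounded by the number of distinct reachable sums rather than by target:

```python
def can_partition_with_negatives(nums: list[int]) -> bool:
    """Set-based subset sum that works for negative values too,
    since set membership doesn't require non-negative indices."""
    total = sum(nums)
    if total % 2 != 0:  # Python's % keeps this check valid for negative totals
        return False
    target = total // 2
    reachable = {0}  # sums achievable with elements seen so far
    for num in nums:
        reachable |= {s + num for s in reachable}
    return target in reachable
```

For non-negative inputs this computes the same answer as the 1D DP; the trade-off is that the set can grow large when values are widely spread.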
Multi-way partition generalizes to k subsets. The related 3-partition problem (splitting the numbers into triples that all share the same sum) is strongly NP-complete, meaning no pseudo-polynomial algorithm exists for it unless P = NP. For k > 2, you typically need approximation algorithms or constraint solvers.
## Complexity Analysis and When to Use
The partition problem’s DP solution is pseudo-polynomial—polynomial in the numeric value of the input, not just its size. This distinction matters:
| Input Characteristics | Time Complexity | Practical? |
|---|---|---|
| n=1000, sum=10000 | ~10 million ops | Fast |
| n=100, sum=1000000 | ~100 million ops | Acceptable |
| n=50, sum=10^9 | ~50 billion ops | Too slow |
When the sum is bounded (common in resource allocation where values represent percentages or small units), DP works beautifully. When values can be arbitrarily large (like monetary amounts in cents), you might need approximation algorithms or meet-in-the-middle approaches.
Use the 1D DP solution when:
- Elements are non-negative integers
- The sum is reasonably bounded (under ~10^7)
- You need an exact answer
Consider alternatives when:
- Sum magnitude is huge relative to n (meet-in-the-middle gives O(2^(n/2)))
- Approximate solutions suffice (greedy heuristics)
- You’re solving multi-way partition (constraint programming)
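The meet-in-the-middle alternative mentioned above can be sketched briefly (function names are illustrative): enumerate all subset sums of each half of the array, then check whether one sum from each half combines to the target.

```python
def can_partition_mitm(nums: list[int]) -> bool:
    """Meet-in-the-middle partition check: roughly O(2^(n/2)) subset sums
    per half, independent of how large the values themselves are."""
    total = sum(nums)
    if total % 2 != 0:
        return False
    target = total // 2

    def subset_sums(arr: list[int]) -> set[int]:
        sums = {0}
        for num in arr:
            sums |= {s + num for s in sums}
        return sums

    mid = len(nums) // 2
    left_sums = subset_sums(nums[:mid])
    right_sums = subset_sums(nums[mid:])
    # A valid partition needs some a from the left half with target - a
    # achievable in the right half
    return any(target - a in right_sums for a in left_sums)
```

For n around 40-50 with huge values, 2^(n/2) set entries are manageable where an O(n × sum) table would not be.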
The partition problem is a gateway to understanding subset sum, knapsack, and a family of related optimization problems. Master this pattern, and you’ll recognize it across domains from compiler optimization to financial portfolio balancing.