Subset Sum Problem: DP and Backtracking Solutions

Key Insights

The subset sum problem is NP-complete, but dynamic programming provides a pseudo-polynomial O(n×sum) solution that works well when the target sum is reasonably bounded.
Backtracking with intelligent pruning often outperforms DP in practice when you need the actual subset or when solutions are sparse and can be found early.
Space optimization reduces the 2D DP table to a 1D array, but you must iterate backwards to avoid overwriting values you still need.

Introduction & Problem Definition

The subset sum problem asks a deceptively simple question: given a set of integers and a target sum, does any subset of those integers add up exactly to the target? Despite its straightforward definition, this problem is NP-complete and serves as a foundation for understanding computational complexity.

You’ll encounter subset sum in practical scenarios more often than you might expect. Resource allocation systems need to determine if a combination of tasks fits within a time budget. Financial applications check if a set of transactions can reconcile to a specific balance. Cryptographic systems, particularly knapsack-based encryption schemes, rely on the computational hardness of this problem.

The problem has two variants: the decision version (does a valid subset exist?) and the search version (what is that subset?). We’ll tackle both, since real applications usually need the actual subset, not just a boolean answer.

Naive Recursive Approach

The most intuitive solution explores every possible subset. For each element, you make a binary choice: include it or exclude it. This generates all 2^n possible subsets and checks each one.

def subset_sum_recursive(nums: list[int], target: int) -> bool:
    def helper(index: int, remaining: int) -> bool:
        # Base cases
        if remaining == 0:
            return True
        if index >= len(nums) or remaining < 0:
            return False
        
        # Include current element or exclude it
        include = helper(index + 1, remaining - nums[index])
        exclude = helper(index + 1, remaining)
        
        return include or exclude
    
    return helper(0, target)

This solution has O(2^n) time complexity because each element doubles the number of recursive paths. For n=30, that’s over a billion operations. For n=50, you’re looking at heat death of the universe territory.

The recursion tree reveals massive redundancy. The subproblem “can we make sum 10 using elements from index 5 onwards?” might be computed hundreds of times through different paths. This overlap is exactly what dynamic programming exploits.

Dynamic Programming Solution (Tabulation)

The bottom-up DP approach builds a 2D boolean table where dp[i][j] answers: “Can we achieve sum j using the first i elements?”

The state transition logic is straightforward:

If we could achieve sum j without element i, we can still achieve it (exclude the element)
If we could achieve sum j - nums[i] without element i, we can now achieve sum j (include the element)

def subset_sum_dp_2d(nums: list[int], target: int) -> bool:
    n = len(nums)
    # dp[i][j] = can we make sum j using first i elements?
    dp = [[False] * (target + 1) for _ in range(n + 1)]
    
    # Base case: sum of 0 is always achievable (empty subset)
    for i in range(n + 1):
        dp[i][0] = True
    
    # Fill the table
    for i in range(1, n + 1):
        for j in range(1, target + 1):
            # Exclude current element
            dp[i][j] = dp[i - 1][j]
            
            # Include current element (if it doesn't exceed target)
            if j >= nums[i - 1]:
                dp[i][j] = dp[i][j] or dp[i - 1][j - nums[i - 1]]
    
    return dp[n][target]

This runs in O(n × target) time with O(n × target) space. Notice this is pseudo-polynomial—it’s polynomial in the numeric value of the target, not the input size. If your target is 10^9, this approach becomes impractical.

Since each row only depends on the previous row, we can collapse the 2D table into a 1D array:

def subset_sum_dp_1d(nums: list[int], target: int) -> bool:
    dp = [False] * (target + 1)
    dp[0] = True  # Empty subset sums to 0
    
    for num in nums:
        # Iterate backwards to avoid using updated values
        for j in range(target, num - 1, -1):
            dp[j] = dp[j] or dp[j - num]
    
    return dp[target]

The backwards iteration is critical. If we iterated forwards, dp[j - num] might already reflect the inclusion of the current element, allowing us to use the same element multiple times. This is actually the unbounded knapsack variant—useful for coin change problems, but wrong here.

Memoization Approach (Top-Down DP)

Memoization keeps the recursive structure but caches results. This approach often feels more natural and handles sparse state spaces efficiently—we only compute subproblems actually needed.

def subset_sum_memo(nums: list[int], target: int) -> bool:
    memo = {}
    
    def helper(index: int, remaining: int) -> bool:
        # Check cache first
        if (index, remaining) in memo:
            return memo[(index, remaining)]
        
        # Base cases
        if remaining == 0:
            return True
        if index >= len(nums) or remaining < 0:
            return False
        
        # Recursive exploration with caching
        result = (helper(index + 1, remaining - nums[index]) or 
                  helper(index + 1, remaining))
        
        memo[(index, remaining)] = result
        return result
    
    return helper(0, target)

Memoization and tabulation have the same asymptotic complexity, but their practical performance differs. Memoization avoids computing unreachable states and handles early termination naturally. Tabulation has better cache locality and no recursion overhead. For subset sum specifically, tabulation usually wins because most states are reachable.

Backtracking Solution

When you need the actual subset—not just whether one exists—backtracking becomes attractive. It systematically explores the solution space while pruning branches that can’t lead to valid solutions.

def subset_sum_backtrack(nums: list[int], target: int) -> list[int] | None:
    nums_sorted = sorted(nums, reverse=True)  # Larger elements first for better pruning
    result = []
    
    def backtrack(index: int, remaining: int) -> bool:
        if remaining == 0:
            return True
        if index >= len(nums_sorted) or remaining < 0:
            return False
        
        # Pruning: if smallest remaining element exceeds remaining sum, stop
        # (only valid if all elements are positive)
        if nums_sorted[-1] > remaining:
            return False
        
        for i in range(index, len(nums_sorted)):
            # Skip duplicates to avoid redundant work
            if i > index and nums_sorted[i] == nums_sorted[i - 1]:
                continue
            
            if nums_sorted[i] > remaining:
                continue
            
            result.append(nums_sorted[i])
            if backtrack(i + 1, remaining - nums_sorted[i]):
                return True
            result.pop()  # Backtrack
        
        return False
    
    if backtrack(0, target):
        return result
    return None

Backtracking shines when solutions exist near the beginning of the search space or when aggressive pruning eliminates large subtrees. Sorting elements in descending order helps find solutions faster—larger elements reduce the remaining sum quickly, leading to earlier termination or pruning.

Complexity Analysis & Comparison

Approach	Time Complexity	Space Complexity	Returns Subset?
Naive Recursive	O(2^n)	O(n) stack	Modifiable
DP Tabulation (2D)	O(n × target)	O(n × target)	With backtracing
DP Tabulation (1D)	O(n × target)	O(target)	No
Memoization	O(n × target)	O(n × target)	Modifiable
Backtracking	O(2^n) worst	O(n)	Yes

Choose DP when:

The target sum is bounded and reasonably small
You need guaranteed polynomial-time performance
You’re solving the decision problem or counting variants

Choose backtracking when:

You need the actual subset elements
Solutions are likely to exist and be found early
The input has structure that enables effective pruning

Variations & Extensions

The subset sum framework extends to several related problems. Counting the number of valid subsets requires a simple modification—instead of boolean OR, we sum the counts:

def count_subsets(nums: list[int], target: int) -> int:
    dp = [0] * (target + 1)
    dp[0] = 1  # One way to make sum 0: empty subset
    
    for num in nums:
        for j in range(target, num - 1, -1):
            dp[j] += dp[j - num]
    
    return dp[target]

The partition equal subset sum problem asks whether an array can be split into two subsets with equal sums. This reduces to subset sum with target = total_sum / 2 (after checking that total_sum is even).

Finding the subset with sum closest to a target uses the same DP table but scans for the largest achievable sum not exceeding the target. This variant appears in load balancing and fair division problems.

These problems inherit subset sum’s NP-completeness, meaning no known polynomial-time algorithm exists for the general case. However, the pseudo-polynomial DP solutions remain practical for bounded inputs—which covers most real-world scenarios where you’re dealing with reasonable numeric ranges rather than cryptographic-scale integers.