Distinct Subsequences: Counting Subsequences DP
Key Insights
- The distinct subsequences problem reveals a fundamental DP pattern: at each position, you either match and advance both pointers, or skip and advance only the source pointer—the total count is the sum of both choices.
- Space optimization from O(m×n) to O(n) works because each row of the table depends only on its neighboring row; the inner loop's direction must be chosen so you don't overwrite values you still need, a technique applicable to many 2D DP problems.
- This counting pattern appears frequently in bioinformatics, text matching, and combinatorics problems—mastering it unlocks solutions to an entire family of related challenges.
Introduction to the Problem
The Distinct Subsequences problem (LeetCode 115) asks a deceptively simple question: given a source string s and a target string t, count how many distinct subsequences of s equal t.
A subsequence differs from a substring in a critical way. A substring must be contiguous characters, while a subsequence can skip characters but must maintain relative order. For “abc”, valid subsequences include “a”, “ac”, “abc”, but not “ca” (wrong order) or “d” (not present).
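The definition can be checked mechanically with a two-pointer scan. As a minimal sketch (the helper name is ours, not part of the problem statement):

```python
def is_subsequence(t: str, s: str) -> bool:
    """Return True if t is a subsequence of s (characters in order, gaps allowed)."""
    it = iter(s)
    # `ch in it` advances the iterator, so relative order is enforced automatically
    return all(ch in it for ch in t)

print(is_subsequence("ac", "abc"))  # True: skip 'b'
print(is_subsequence("ca", "abc"))  # False: wrong order
```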
This problem appears in practical contexts more often than you might expect. Bioinformatics uses similar algorithms for DNA sequence alignment. Text editors leverage these patterns for fuzzy matching. Plagiarism detection systems count common subsequences to identify copied content.
Let’s start with the naive approach to understand why we need dynamic programming:
def num_distinct_brute_force(s: str, t: str) -> int:
    def recurse(i: int, j: int) -> int:
        # Base case: matched all of t
        if j == len(t):
            return 1
        # Base case: exhausted s without matching t
        if i == len(s):
            return 0
        # Always try skipping the current character in s
        count = recurse(i + 1, j)
        # If the characters match, also try using this character
        if s[i] == t[j]:
            count += recurse(i + 1, j + 1)
        return count
    return recurse(0, 0)
This solution has O(2^m) time complexity where m is the length of s. Each character in s presents a binary choice: use it or skip it. For large inputs, this explodes exponentially.
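To see the blowup concretely, we can instrument the recursion with a call counter (a diagnostic helper of ours, not part of the solution). On all-matching inputs, every position fires both branches and the call count grows exponentially:

```python
def count_calls(s: str, t: str) -> int:
    """Run the brute-force recursion and return how many calls it makes."""
    calls = 0

    def recurse(i: int, j: int) -> int:
        nonlocal calls
        calls += 1
        if j == len(t):
            return 1
        if i == len(s):
            return 0
        count = recurse(i + 1, j)
        if s[i] == t[j]:
            count += recurse(i + 1, j + 1)
        return count

    recurse(0, 0)
    return calls

# Every character matches, so both branches fire at every step
for k in (2, 4, 8):
    print(k, count_calls("a" * k, "a" * (k // 2)))
```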
Building Intuition with Small Examples
Consider the classic example: counting subsequences of “rabbbit” that equal “rabbit”. The answer is 3, but why?
The source has three ‘b’ characters at positions 2, 3, and 4. To form “rabbit”, we need exactly two ‘b’ characters. We can choose:
- Positions 2 and 3 (skip position 4)
- Positions 2 and 4 (skip position 3)
- Positions 3 and 4 (skip position 2)
Each selection produces a valid subsequence. The other characters have only one option each, so three total subsequences exist.
Here’s a helper to visualize all subsequences for debugging small inputs:
def enumerate_subsequences(s: str, t: str) -> list[str]:
    """Returns all subsequences of s that equal t, showing selected indices."""
    results = []
    def backtrack(i: int, j: int, path: list[int]):
        if j == len(t):
            results.append(f"indices: {path} -> {''.join(s[idx] for idx in path)}")
            return
        if i == len(s):
            return
        # Skip current character
        backtrack(i + 1, j, path)
        # Use current character if it matches
        if s[i] == t[j]:
            backtrack(i + 1, j + 1, path + [i])
    backtrack(0, 0, [])
    return results

# Example usage
for seq in enumerate_subsequences("rabbbit", "rabbit"):
    print(seq)
# Output:
# indices: [0, 1, 3, 4, 5, 6] -> rabbit
# indices: [0, 1, 2, 4, 5, 6] -> rabbit
# indices: [0, 1, 2, 3, 5, 6] -> rabbit
Notice how the recursive calls overlap. When we’re at position i=3 in s and j=2 in t, we compute the same subproblem regardless of how we arrived there. This overlapping substructure screams for memoization.
Recursive Solution with Memoization (Top-Down DP)
The recurrence relation captures our two choices at each step:
dp(i, j) = dp(i+1, j) # skip s[i]
+ dp(i+1, j+1) if s[i] == t[j] # use s[i] to match t[j]
Base cases:
- j == len(t): we’ve matched all of t, return 1
- i == len(s): we’ve exhausted s without matching t, return 0
The state space is O(m × n) since i ranges over s and j ranges over t. Each state computes in O(1), giving us O(m × n) time complexity.
from functools import lru_cache
def num_distinct_memo(s: str, t: str) -> int:
    @lru_cache(maxsize=None)
    def dp(i: int, j: int) -> int:
        # Successfully matched all of t
        if j == len(t):
            return 1
        # Ran out of characters in s
        if i == len(s):
            return 0
        # Remaining s is shorter than remaining t - impossible to match
        if len(s) - i < len(t) - j:
            return 0
        # Count ways by skipping s[i]
        result = dp(i + 1, j)
        # Add ways by matching s[i] with t[j]
        if s[i] == t[j]:
            result += dp(i + 1, j + 1)
        return result
    return dp(0, 0)
The pruning condition len(s) - i < len(t) - j prevents exploring impossible states where remaining source characters can’t possibly match remaining target characters.
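lru_cache also makes the bounded state space easy to observe through its cache_info() method. A quick check (the wrapper name is ours, for illustration):

```python
from functools import lru_cache

def num_distinct_with_stats(s: str, t: str) -> tuple[int, int]:
    """Solve with memoization and report how many distinct (i, j) states were cached."""
    @lru_cache(maxsize=None)
    def dp(i: int, j: int) -> int:
        if j == len(t):
            return 1
        if i == len(s) or len(s) - i < len(t) - j:
            return 0
        result = dp(i + 1, j)
        if s[i] == t[j]:
            result += dp(i + 1, j + 1)
        return result

    answer = dp(0, 0)
    return answer, dp.cache_info().currsize

ways, states = num_distinct_with_stats("rabbbit", "rabbit")
print(ways)    # 3
print(states)  # bounded by (7 + 1) * (6 + 1) = 56 possible states
```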
Tabulation Approach (Bottom-Up DP)
Converting to bottom-up iteration eliminates recursion overhead and makes the algorithm’s structure explicit. We build a 2D table where dp[i][j] represents the number of ways to match t[j:] using s[i:].
def num_distinct_tabulation(s: str, t: str) -> int:
    m, n = len(s), len(t)
    # dp[i][j] = ways to form t[j:] from s[i:]
    # Extra row/column for base cases
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    # Base case: empty t can be formed from any suffix of s (one way: select nothing)
    for i in range(m + 1):
        dp[i][n] = 1
    # Fill table from bottom-right to top-left
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            # Always can skip s[i]
            dp[i][j] = dp[i + 1][j]
            # If characters match, add ways from matching
            if s[i] == t[j]:
                dp[i][j] += dp[i + 1][j + 1]
    return dp[0][0]
The table fills right-to-left, bottom-to-top because each cell depends on cells below and to the right. Visualizing for s=“abb”, t=“ab”:
      a   b   ""
    +---+---+---+
  a | 2 | 2 | 1 |  dp[0][0]=2: two ways to form "ab" from "abb"
    +---+---+---+
  b | 0 | 2 | 1 |  dp[1][1]=2: two ways to form "b" from "bb"
    +---+---+---+
  b | 0 | 1 | 1 |  dp[2][1]=1: one way to form "b" from "b"
    +---+---+---+
 "" | 0 | 0 | 1 |  base case row
    +---+---+---+
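To double-check a trace like this by hand, it helps to print the whole table. A small inspection helper mirroring the tabulation code:

```python
def build_dp_table(s: str, t: str) -> list[list[int]]:
    """Full suffix table: dp[i][j] = ways to form t[j:] from s[i:]."""
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][n] = 1  # empty target: one way
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            dp[i][j] = dp[i + 1][j]
            if s[i] == t[j]:
                dp[i][j] += dp[i + 1][j + 1]
    return dp

for row in build_dp_table("abb", "ab"):
    print(row)
# [2, 2, 1]
# [0, 2, 1]
# [0, 1, 1]
# [0, 0, 1]
```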
Space Optimization Techniques
Observe that each row only depends on the row below it. We can collapse the 2D table into a single 1D array, reducing space from O(m × n) to O(n).
The critical insight: the inner loop's direction must be chosen so that each update reads values from the previous row, not values already overwritten in the current pass. In the suffix formulation below, dp[j] reads dp[j + 1], so j must move forward; dp[j + 1] is only updated after dp[j] is done with it.
def num_distinct_optimized(s: str, t: str) -> int:
    m, n = len(s), len(t)
    # dp[j] = ways to form t[j:] from the suffix of s processed so far
    dp = [0] * (n + 1)
    dp[n] = 1  # Base case: one way to form the empty string
    # Process each character of s from end to start
    for i in range(m - 1, -1, -1):
        # j moves left to right so dp[j + 1] still holds the previous
        # row's value when dp[j] reads it
        for j in range(n):
            if s[i] == t[j]:
                dp[j] += dp[j + 1]
        # Note: dp[n] stays 1 throughout
    return dp[0]
An equivalent formulation flips both directions: process s left to right and let dp[j] count the ways to form the prefix t[:j] from the characters of s seen so far. Now each update reads dp[j - 1], so the inner loop must run backward to avoid consuming a value already updated for the current character:
def num_distinct_space_optimized(s: str, t: str) -> int:
    n = len(t)
    dp = [0] * (n + 1)
    dp[0] = 1  # One way to form the empty prefix of t
    for char in s:
        # Must go backwards: dp[j] reads dp[j - 1] from BEFORE this pass
        for j in range(n, 0, -1):
            if char == t[j - 1]:
                dp[j] += dp[j - 1]
    return dp[n]
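To see concretely why the backward loop matters in the prefix formulation, compare it against a deliberately buggy forward version (both helper names are ours, for illustration). The forward loop lets a single character of s be counted at two positions of t in the same pass:

```python
def prefix_dp(s: str, t: str) -> int:
    """Correct prefix formulation: backward inner loop."""
    n = len(t)
    dp = [0] * (n + 1)
    dp[0] = 1
    for char in s:
        for j in range(n, 0, -1):  # backward: dp[j - 1] is still stale
            if char == t[j - 1]:
                dp[j] += dp[j - 1]
    return dp[n]

def prefix_dp_buggy(s: str, t: str) -> int:
    """Same recurrence, forward inner loop: reads dp[j - 1] already updated this pass."""
    n = len(t)
    dp = [0] * (n + 1)
    dp[0] = 1
    for char in s:
        for j in range(1, n + 1):  # forward: overcounts
            if char == t[j - 1]:
                dp[j] += dp[j - 1]
    return dp[n]

print(prefix_dp("aaa", "aa"))        # 3: choose 2 of the 3 a's
print(prefix_dp_buggy("aaa", "aa"))  # 6: reuses a single 'a' for both positions
```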
Edge Cases and Variations
Production code must handle edge cases gracefully:
def num_distinct_production(s: str, t: str, mod: int | None = None) -> int:
    """
    Count distinct subsequences of s that equal t.

    Args:
        s: Source string
        t: Target string
        mod: Optional modulus for large counts (e.g., 10**9 + 7)

    Returns:
        Count of distinct subsequences, optionally modulo mod
    """
    # Edge case: empty target matches any source exactly once
    if not t:
        return 1
    # Edge case: empty source can't match a non-empty target
    if not s:
        return 0
    # Edge case: source shorter than target - impossible
    if len(s) < len(t):
        return 0
    n = len(t)
    dp = [0] * (n + 1)
    dp[0] = 1
    for char in s:
        for j in range(n, 0, -1):
            if char == t[j - 1]:
                dp[j] += dp[j - 1]
                if mod:
                    dp[j] %= mod
    return dp[n]
Related problems worth practicing:
- Edit Distance (LeetCode 72): Similar 2D DP structure with different transitions
- Longest Common Subsequence (LeetCode 1143): Maximize instead of count
- Number of Ways to Form Target (LeetCode 1639): Counting with character frequency constraints
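For contrast with Longest Common Subsequence, here is a sketch of its suffix table: the table shape and fill order are identical to the tabulation above, but the combine step takes a max instead of a sum.

```python
def lcs_length(a: str, b: str) -> int:
    """Longest Common Subsequence using the same suffix-table layout."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]  # extend the match
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])  # best of skipping either side
    return dp[0][0]

print(lcs_length("rabbbit", "rabbit"))  # 6: "rabbit" itself is a common subsequence
```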
Complexity Summary and Key Takeaways
| Approach | Time | Space | Readability |
|---|---|---|---|
| Brute Force | O(2^m) | O(m) | High |
| Memoization | O(m×n) | O(m×n) | High |
| Tabulation | O(m×n) | O(m×n) | Medium |
| Space-Optimized | O(m×n) | O(n) | Lower |
Choose memoization for interviews—it’s easier to derive correctly. Use space-optimized tabulation in production when memory matters.
The counting subsequences pattern applies whenever you need to count ways to select elements maintaining order. Key signals: “how many ways”, “subsequence”, “maintain order”. When you see these, consider whether the choice at each position is binary (use/skip) and whether subproblems overlap.
Practice these related problems to cement the pattern: Decode Ways, Unique Paths, and Interleaving String all use similar state transition logic with different recurrence relations.