Code Coverage: Line, Branch, and Path Coverage
Code coverage measures how much of your source code executes during testing. It's one of the few objective metrics we have for test quality, but it's frequently misunderstood and misused.
Key Insights
- Line coverage tells you what code ran, branch coverage tells you what decisions were tested, and path coverage tells you what combinations were verified—each catches different categories of bugs.
- 100% line coverage can still leave 50% or more of your branches untested; always measure branch coverage for code with conditional logic.
- Path coverage is theoretically ideal but practically impossible for most real code—use it selectively for critical algorithms, not as a blanket requirement.
Introduction to Code Coverage
Here’s the uncomfortable truth: high coverage doesn’t mean your code works correctly. It means your tests touched that code. A test that executes a function but never checks its output still contributes to coverage. Coverage tells you what you haven’t tested—it can’t tell you if what you have tested is meaningful.
That said, coverage remains valuable. Low coverage is a reliable signal of undertested code. And understanding the types of coverage—line, branch, and path—helps you write tests that actually find bugs rather than just inflate metrics.
Line Coverage (Statement Coverage)
Line coverage, also called statement coverage, answers a simple question: did this line of code execute during testing?
def calculate_discount(price, is_member):
    discount = 0                      # Line 1
    if is_member:                     # Line 2
        discount = price * 0.1        # Line 3
    final_price = price - discount    # Line 4
    return final_price                # Line 5
If you write a single test:
def test_member_discount():
    result = calculate_discount(100, True)
    assert result == 90
You achieve 100% line coverage. Every line executes. But you’ve never tested what happens when is_member is False. The function might have a bug in that scenario, and your coverage report would show green across the board.
Line coverage is the most common metric because it’s easy to compute and understand. Most coverage tools default to it. But it’s also the weakest form of coverage—it treats code as a sequence of statements rather than a graph of decisions.
Use line coverage as a baseline. If you’re below 60-70%, you have significant gaps. But don’t mistake high line coverage for thorough testing.
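To make "did this line execute" concrete, here is a minimal sketch of a line tracer built on the standard library's sys.settrace hook. It illustrates the idea only and is not meant to describe how any particular coverage tool is implemented. The helper name executed_lines is hypothetical, and lines are reported as offsets from the def line.

```python
import sys


def calculate_discount(price, is_member):
    discount = 0
    if is_member:
        discount = price * 0.1
    final_price = price - discount
    return final_price


def executed_lines(func, *args):
    """Run func(*args) and return the line offsets (from the def line) that ran.

    Hypothetical helper, for illustration only.
    """
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Only follow frames that belong to the function under test.
        if frame.f_code is not code:
            return None
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return hit


member_run = executed_lines(calculate_discount, 100, True)
non_member_run = executed_lines(calculate_discount, 100, False)
```

The member run touches every line of the body, which is why the single test reports full line coverage. The non-member run skips the line inside the if, and nothing in the member run's result reveals that the False outcome was never exercised.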
Branch Coverage (Decision Coverage)
Branch coverage measures whether each decision point in your code has been evaluated to both true and false. Every if, while, for, and ternary operator creates branches.
def validate_order(quantity, has_stock, is_premium):
    if quantity <= 0:
        return "Invalid quantity"
    if has_stock or is_premium:
        return "Order accepted"
    return "Out of stock"
This function has four branches:
- quantity <= 0 is true
- quantity <= 0 is false
- has_stock or is_premium is true
- has_stock or is_premium is false
Here’s where it gets tricky. Consider this test suite:
def test_invalid_quantity():
    assert validate_order(0, True, False) == "Invalid quantity"

def test_order_accepted():
    assert validate_order(5, True, False) == "Order accepted"

def test_out_of_stock():
    assert validate_order(5, False, False) == "Out of stock"
You have 100% line coverage and 100% branch coverage. But you’ve never tested the case where has_stock is False and is_premium is True. The compound condition has_stock or is_premium evaluated to true only because has_stock was true.
This reveals a subtle distinction. Basic branch coverage (also called decision coverage) checks that each decision evaluates to both outcomes, but it doesn’t require each individual condition inside a compound decision to be exercised. For that, you need condition coverage, which requires every condition to take both values, or Modified Condition/Decision Coverage (MC/DC), which additionally requires showing that each condition independently affects the decision’s outcome. MC/DC is required in safety-critical systems like aviation software.
For most applications, branch coverage is the sweet spot. It catches significantly more bugs than line coverage without the exponential test requirements of full path coverage.
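The gap in the three-test suite above can be shown with a deliberately broken variant. The sketch below assumes a hypothetical regression, validate_order_buggy, in which the is_premium check was dropped. The helper passes and the two case lists are illustrative names, not part of the original example.

```python
def validate_order(quantity, has_stock, is_premium):
    if quantity <= 0:
        return "Invalid quantity"
    if has_stock or is_premium:
        return "Order accepted"
    return "Out of stock"


def validate_order_buggy(quantity, has_stock, is_premium):
    # Hypothetical regression: the is_premium condition was lost.
    if quantity <= 0:
        return "Invalid quantity"
    if has_stock:
        return "Order accepted"
    return "Out of stock"


# The three tests that already give 100% branch coverage.
BRANCH_SUITE = [
    ((0, True, False), "Invalid quantity"),
    ((5, True, False), "Order accepted"),
    ((5, False, False), "Out of stock"),
]

# The extra case that exercises is_premium on its own.
CONDITION_CASE = ((5, False, True), "Order accepted")


def passes(func, cases):
    """Return True if func returns the expected value for every case."""
    return all(func(*args) == expected for args, expected in cases)


branch_suite_misses_bug = passes(validate_order_buggy, BRANCH_SUITE)
extended_suite_catches_bug = not passes(
    validate_order_buggy, BRANCH_SUITE + [CONDITION_CASE]
)
```

The branch-level suite passes against the broken function, so full branch coverage gives no warning. Only the case with has_stock False and is_premium True tells the two implementations apart.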
Path Coverage
Path coverage is the most rigorous metric: it requires testing every possible execution path through a function. A path is a unique sequence of branches from entry to exit.
def process_payment(amount, has_coupon, is_member, use_points):
    total = amount
    if has_coupon:      # Decision 1
        total -= 10
    if is_member:       # Decision 2
        total *= 0.9
    if use_points:      # Decision 3
        total -= 5
    return max(total, 0)
Three independent if statements create 2³ = 8 possible paths:
| Path | has_coupon | is_member | use_points |
|---|---|---|---|
| 1 | F | F | F |
| 2 | F | F | T |
| 3 | F | T | F |
| 4 | F | T | T |
| 5 | T | F | F |
| 6 | T | F | T |
| 7 | T | T | F |
| 8 | T | T | T |
Testing all eight paths catches bugs that only manifest in specific combinations—like a discount calculation that goes negative only when all three discounts apply.
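Because the three decisions are independent, the eight rows of the table can be generated rather than written by hand. The following sketch uses itertools.product to call process_payment once per combination and checks one invariant on every path. The helper all_path_results and the amount of 12 are arbitrary choices made for illustration.

```python
from itertools import product


def process_payment(amount, has_coupon, is_member, use_points):
    total = amount
    if has_coupon:
        total -= 10
    if is_member:
        total *= 0.9
    if use_points:
        total -= 5
    return max(total, 0)


def all_path_results(amount):
    """Call process_payment once per True/False combination of the three flags."""
    return {
        flags: process_payment(amount, *flags)
        for flags in product([False, True], repeat=3)
    }


results = all_path_results(12)

# Invariant checked on every path: the charge is never negative.
never_negative = all(total >= 0 for total in results.values())
```

A small amount is chosen on purpose so that the all-discounts path would go below zero without the max(total, 0) guard. In a pytest suite, each combination would typically become its own parametrized case so a failure names the exact path.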
The problem is combinatorial explosion. Add a fourth condition and you have 16 paths. Add a loop that executes 0, 1, or many times, and paths multiply further. Real-world functions with nested conditions and loops can have thousands or millions of theoretical paths.
def complex_validation(a, b, c, d, e, f, g, h):
    # 8 independent boolean checks = 256 paths
    # This is before considering loops or early returns
    pass  # checks omitted
Full path coverage is impractical for most code. Use it selectively for critical algorithms—financial calculations, security checks, state machines—where bugs have severe consequences.
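The growth described in this section can be put into numbers. The sketch below is a simplified counting model, not a general path-analysis tool: it assumes decisions are independent and in sequence, and it models a loop as a body containing a single if that runs between zero and a fixed maximum number of times. Both function names are hypothetical.

```python
def sequential_paths(decisions):
    """Paths through `decisions` independent if statements in sequence."""
    return 2 ** decisions


def loop_paths(max_iterations):
    """Paths through a loop whose body holds one if and runs 0..max_iterations times.

    A run with i iterations has 2**i possible branch sequences, so the total
    is the sum over i, which equals 2**(max_iterations + 1) - 1.
    """
    return sum(2 ** i for i in range(max_iterations + 1))


three_checks = sequential_paths(3)
eight_checks = sequential_paths(8)
loop_of_ten = loop_paths(10)
```

Even under this simple model, one branch inside a loop bounded at ten iterations yields more paths than eight sequential checks, which is why exhaustive path testing is usually reserved for small, critical functions.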
Comparing Coverage Types: A Practical Example
Let’s analyze a realistic function under all three coverage types:
def calculate_shipping(weight, distance, is_express, is_fragile):
    if weight <= 0 or distance <= 0:
        raise ValueError("Invalid parameters")

    base_cost = weight * 0.5 + distance * 0.1

    if is_express:
        base_cost *= 2

    if is_fragile:
        base_cost += 15
        if weight > 10:
            base_cost += 10  # Heavy fragile surcharge

    return round(base_cost, 2)
Test Suite A: Line-focused
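def test_basic_shipping():
    assert calculate_shipping(15, 100, True, True) == 60.0
- Line coverage: about 90% (every line runs except the raise in the validation check)
- Branch coverage: 50% (never tests false branches or validation error)
- Path coverage: about 17% (1 of 6 paths, ignoring validation; the weight check only runs for fragile items, so there are 2 × 3 = 6 paths rather than 8)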
Test Suite B: Branch-focused
import pytest

def test_invalid_weight():
    with pytest.raises(ValueError):
        calculate_shipping(0, 100, False, False)

def test_standard_shipping():
    assert calculate_shipping(5, 100, False, False) == 12.5

def test_express_shipping():
    assert calculate_shipping(5, 100, True, False) == 25.0

def test_fragile_light():
    assert calculate_shipping(5, 100, False, True) == 27.5

def test_fragile_heavy():
    assert calculate_shipping(15, 100, False, True) == 42.5
- Line coverage: 100%
- Branch coverage: 100% (each branch taken both ways)
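- Path coverage: about 67% (4 of the 6 paths past validation, since the weight check only runs for fragile items; both express+fragile paths remain untested)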
Test Suite C: Path-focused
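Add tests for the combinations Suite B misses, such as express+fragile and express+fragile+heavy. Only these tests pin down how the express multiplier interacts with the fragile surcharges. Here the surcharges are added after the doubling, so they are not doubled themselves. If someone reordered those statements, the price of every express fragile shipment would change, and Suite B would still pass despite its 100% branch coverage.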
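A path-focused suite for this function might look like the sketch below. It assumes the weight check is nested under is_fragile, as the "Heavy fragile surcharge" comment suggests, which gives six paths past validation plus the error path. The names PATH_CASES, failing_cases, and rejects_invalid_input are hypothetical, and plain helpers are used instead of pytest so the block runs on its own.

```python
def calculate_shipping(weight, distance, is_express, is_fragile):
    if weight <= 0 or distance <= 0:
        raise ValueError("Invalid parameters")
    base_cost = weight * 0.5 + distance * 0.1
    if is_express:
        base_cost *= 2
    if is_fragile:
        base_cost += 15
        if weight > 10:  # assumed to be nested under is_fragile
            base_cost += 10
    return round(base_cost, 2)


# One case per path: (weight, distance, is_express, is_fragile) -> expected cost
PATH_CASES = {
    (5, 100, False, False): 12.5,   # standard
    (5, 100, True, False): 25.0,    # express
    (5, 100, False, True): 27.5,    # fragile, light
    (15, 100, False, True): 42.5,   # fragile, heavy
    (5, 100, True, True): 40.0,     # express + fragile, light
    (15, 100, True, True): 60.0,    # express + fragile, heavy
}


def failing_cases(func, cases):
    """Return {args: actual} for every case whose result differs from expected."""
    failures = {}
    for args, expected in cases.items():
        actual = func(*args)
        if actual != expected:
            failures[args] = actual
    return failures


def rejects_invalid_input(func):
    """Cover the validation path: a non-positive weight must raise ValueError."""
    try:
        func(0, 100, False, False)
    except ValueError:
        return True
    return False


failures = failing_cases(calculate_shipping, PATH_CASES)
validation_ok = rejects_invalid_input(calculate_shipping)
```

The last two entries in PATH_CASES are the ones Suite B lacks. With pytest, the same table could feed pytest.mark.parametrize so each path reports separately.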
Tools and Implementation
Every major language has mature coverage tools:
- JavaScript/TypeScript: Istanbul (nyc), c8, Jest’s built-in coverage
- Python: coverage.py, pytest-cov
- Java/Kotlin: JaCoCo, Cobertura
- Go: Built-in go test -cover
- C/C++: gcov, lcov
Here’s a practical configuration for a JavaScript project using Jest:
{
  "jest": {
    "collectCoverage": true,
    "coverageThreshold": {
      "global": {
        "branches": 80,
        "functions": 85,
        "lines": 85,
        "statements": 85
      },
      "src/payments/**/*.js": {
        "branches": 95,
        "lines": 95
      }
    },
    "coverageReporters": ["text", "lcov", "html"]
  }
}
Note the per-directory thresholds. Critical code paths deserve higher standards than utility functions.
For Python with pytest:
# pytest.ini
[pytest]
addopts = --cov=src --cov-branch --cov-fail-under=80
The --cov-branch flag is crucial: it makes the run measure branch coverage in addition to line coverage.
Best Practices and Pitfalls
Set realistic thresholds. 80% line coverage and 70% branch coverage are reasonable targets for most codebases. 100% is achievable but often not worth the effort—you’ll spend hours testing trivial getters and error handlers that are obvious by inspection.
Measure branch coverage, not just lines. Configure your tools to report branch coverage. The difference between 90% line coverage and 90% branch coverage can be dozens of untested conditional paths.
Watch for coverage theater. Tests that execute code without meaningful assertions inflate coverage without improving quality. Review coverage reports alongside the actual tests.
# This "test" achieves coverage but tests nothing
def test_coverage_theater():
    result = complex_calculation(1, 2, 3)
    assert True  # Always passes
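For contrast, here is a sketch of the same check written with and without a real assertion. The body of complex_calculation is not shown in the example above, so the version below is a hypothetical stand-in, and broken_calculation is an invented faulty variant. Both checks execute the function completely, so they produce identical coverage.

```python
def complex_calculation(a, b, c):
    # Hypothetical stand-in for the function under test.
    return a * b + c


def broken_calculation(a, b, c):
    # Hypothetical faulty variant: adds where it should multiply.
    return a + b + c


def theater_check(func):
    """Executes the code but asserts nothing about the result."""
    func(1, 2, 3)
    return True


def meaningful_check(func):
    """Executes the same code and compares results with expected values."""
    return func(1, 2, 3) == 5 and func(0, 2, 3) == 3


theater_passes_on_bug = theater_check(broken_calculation)
meaningful_passes_on_bug = meaningful_check(broken_calculation)
```

The theater check passes against the faulty variant while the meaningful check fails, even though a coverage report would show the same numbers for both.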
Use path coverage surgically. Identify your highest-risk code—payment processing, authentication, data validation—and invest in thorough path coverage there. Apply lighter standards to low-risk utilities.
Coverage is a floor, not a ceiling. Meeting your coverage threshold doesn’t mean testing is done. It means you’ve met the minimum bar. The goal is finding bugs, not painting coverage reports green.
Coverage metrics guide your testing efforts. They highlight untested code, reveal complex branching logic that needs attention, and prevent test suites from rotting over time. But they’re a means to an end—working software—not the end itself.