How to Calculate the Probability of a Union

Key Insights

Union probability P(A ∪ B) represents the likelihood of at least one event occurring, calculated as P(A) + P(B) - P(A ∩ B) to avoid double-counting overlaps
The intersection term P(A ∩ B) is critical: forgetting it is the most common mistake, overstates the union whenever the events overlap, and can push results above 1.0
For multiple events, use the inclusion-exclusion principle, which alternates adding individual probabilities and subtracting overlaps of increasing size

Introduction to Union Probability

Union probability answers a fundamental question: what’s the chance that at least one of several events occurs? In notation, P(A ∪ B) represents the probability that event A happens, event B happens, or both happen.

This concept appears constantly in production software. When running A/B tests, you might need to know the probability a user engages with either feature variant. In monitoring systems, you calculate the likelihood of experiencing at least one type of failure. In user analytics, you determine how many users match at least one targeting criterion.

Unlike intersection probability (both events occurring), union probability is inclusive—we’re casting a wider net. Understanding how to calculate it correctly prevents serious bugs in analytics pipelines, risk models, and decision systems.

The Addition Rule for Two Events

The fundamental formula for union probability is:

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

Why subtract the intersection? When you add P(A) and P(B), you count the overlap twice—once in each probability. The intersection P(A ∩ B) represents outcomes where both events occur simultaneously, so we subtract it once to correct the double-counting.

Consider a SaaS application where 40% of users enable dark mode (A) and 30% enable notifications (B), with 15% enabling both. The probability a random user has at least one feature enabled is:

P(A ∪ B) = 0.40 + 0.30 - 0.15 = 0.55 (55%)

If you forgot the subtraction, you’d incorrectly calculate 70%, because the 15% of users who enabled both features would be counted twice.
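In practice the three inputs usually come from data rather than being given. The sketch below assumes a hypothetical population of 20 user IDs, chosen so the proportions match the example above. It checks the addition rule against a direct count of the union.

```python
# Hypothetical user IDs, invented for illustration; real IDs would come
# from your own data store.
all_users = set(range(1, 21))            # 20 users in total
dark_mode = {1, 2, 3, 4, 5, 6, 7, 8}     # 8 of 20  -> P(A) = 0.40
notifications = {6, 7, 8, 9, 10, 11}     # 6 of 20  -> P(B) = 0.30

n = len(all_users)
p_a = len(dark_mode) / n
p_b = len(notifications) / n
p_both = len(dark_mode & notifications) / n   # users 6, 7, 8 -> 0.15

# Addition rule versus counting the union directly
by_formula = p_a + p_b - p_both
by_counting = len(dark_mode | notifications) / n   # 11 of 20 users

print(f"formula:  {by_formula:.2f}")   # formula:  0.55
print(f"counting: {by_counting:.2f}")  # counting: 0.55
```

Counting the union directly is the simplest check when you have the raw records. The formula matters when you only have aggregate rates.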

Here’s a practical implementation with visualization:

import matplotlib.pyplot as plt
import matplotlib.patches as patches

def calculate_union_probability(p_a, p_b, p_intersection):
    """
    Calculate P(A ∪ B) using the addition rule.
    
    Args:
        p_a: Probability of event A
        p_b: Probability of event B
        p_intersection: Probability of both A and B
    
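    Returns:
        Dict with the union and the probability of each region
        (A only, B only, both, neither)
    """
    if not (0 <= p_a <= 1 and 0 <= p_b <= 1 and 0 <= p_intersection <= 1):
        raise ValueError("Probabilities must be between 0 and 1")
    
    if p_intersection > min(p_a, p_b):
        raise ValueError("Intersection cannot exceed either individual probability")
    
    union = p_a + p_b - p_intersection
    
    # The union is itself a probability, so it cannot exceed 1.
    # A small tolerance absorbs floating-point rounding at the boundary.
    if union > 1 + 1e-9:
        raise ValueError("Intersection is too small: P(A ∪ B) would exceed 1")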
    
    return {
        'union': union,
        'p_a_only': p_a - p_intersection,
        'p_b_only': p_b - p_intersection,
        'p_both': p_intersection,
        'p_neither': 1 - union
    }

def visualize_union(p_a, p_b, p_intersection):
    """Draw a schematic Venn diagram of the union (circle areas are not to scale)."""
    fig, ax = plt.subplots(figsize=(10, 6))
    
    # Create circles
    circle_a = patches.Circle((0.35, 0.5), 0.25, alpha=0.5, color='blue', label='A')
    circle_b = patches.Circle((0.65, 0.5), 0.25, alpha=0.5, color='red', label='B')
    
    ax.add_patch(circle_a)
    ax.add_patch(circle_b)
    
    result = calculate_union_probability(p_a, p_b, p_intersection)
    
    # Add text annotations
    ax.text(0.25, 0.5, f"{result['p_a_only']:.2f}", fontsize=12, ha='center')
    ax.text(0.5, 0.5, f"{result['p_both']:.2f}", fontsize=12, ha='center')
    ax.text(0.75, 0.5, f"{result['p_b_only']:.2f}", fontsize=12, ha='center')
    ax.text(0.5, 0.9, f"P(A ∪ B) = {result['union']:.2f}", fontsize=14, ha='center', weight='bold')
    
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_aspect('equal')
    ax.axis('off')
    plt.legend()
    plt.title('Union Probability Visualization')
    plt.tight_layout()
    
    return result

# Example usage
result = visualize_union(0.40, 0.30, 0.15)
print(f"Union probability: {result['union']:.2f}")
plt.show()  # or plt.savefig("union.png") in a headless environment

Mutually Exclusive Events (Special Case)

Mutually exclusive events cannot occur simultaneously, so P(A ∩ B) = 0. Think of a single die roll: getting a 2 and getting a 5 are mutually exclusive outcomes.

When events are mutually exclusive, the formula simplifies to:

P(A ∪ B) = P(A) + P(B)

This is straightforward addition because there’s no overlap to subtract. However, incorrectly assuming mutual exclusivity is a common error. User behaviors rarely exclude each other—users can enable multiple features, encounter multiple error types, or belong to multiple segments.

def compare_exclusive_vs_overlapping():
    """Demonstrate the difference with dice rolling examples."""
    
    # Mutually exclusive: rolling a 2 OR a 5
    p_roll_2 = 1/6
    p_roll_5 = 1/6
    p_intersection_exclusive = 0  # Can't roll both simultaneously
    
    exclusive_union = p_roll_2 + p_roll_5 - p_intersection_exclusive
    print(f"Mutually exclusive (roll 2 or 5): {exclusive_union:.4f}")
    print(f"Simplified calculation: {p_roll_2 + p_roll_5:.4f}\n")
    
    # Non-exclusive: rolling even OR rolling > 3
    # Even: {2, 4, 6}, >3: {4, 5, 6}, Intersection: {4, 6}
    p_even = 3/6
    p_greater_than_3 = 3/6
    p_intersection = 2/6  # Both even AND >3
    
    overlapping_union = p_even + p_greater_than_3 - p_intersection
    print(f"Overlapping events (even or >3): {overlapping_union:.4f}")
    print(f"Without subtraction (WRONG): {p_even + p_greater_than_3:.4f}")
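    # Doubled braces print literal set braces; single braces would be
    # evaluated as a tuple expression and print (2, 4, 5, 6)
    print(f"Outcomes: {{2, 4, 5, 6}} = 4/6 = {4/6:.4f}")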
    
    return {
        'exclusive': exclusive_union,
        'overlapping': overlapping_union
    }

compare_exclusive_vs_overlapping()

Union of Multiple Events

For three or more events, we use the inclusion-exclusion principle. For three events:

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(A ∩ C) - P(B ∩ C) + P(A ∩ B ∩ C)

The pattern alternates: add individual probabilities, subtract pairwise intersections, add back three-way intersections, and so on. Each level corrects for overcounting at the previous level.
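For n events A_1, …, A_n, the same pattern is usually written as the general inclusion-exclusion formula. It is shown here in LaTeX for reference. The index k is the size of the subsets being intersected.

```latex
P\left(\bigcup_{i=1}^{n} A_i\right)
  = \sum_{k=1}^{n} (-1)^{k+1}
    \sum_{1 \le i_1 < \cdots < i_k \le n}
    P\left(A_{i_1} \cap \cdots \cap A_{i_k}\right)
```

Setting n = 3 recovers the three-event formula above. The sum ranges over every non-empty subset of the events, so it has 2^n − 1 terms, and you need an intersection probability for each of them.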

Here’s a general implementation:

from itertools import combinations
from typing import Dict

def union_probability_multiple(
    individual_probs: Dict[str, float],
    intersections: Dict[frozenset, float]
) -> float:
    """
    Calculate union probability for n events using inclusion-exclusion.
    
    Args:
        individual_probs: Dict mapping event names to probabilities
        intersections: Dict mapping frozensets of event names to intersection probabilities
    
    Returns:
        Union probability
    """
    events = list(individual_probs.keys())
    n = len(events)
    union = 0.0
    
    # Iterate through all subset sizes
    for size in range(1, n + 1):
        sign = 1 if size % 2 == 1 else -1
        
        # Generate all combinations of this size
        for combo in combinations(events, size):
            combo_set = frozenset(combo)
            
            if size == 1:
                # Individual probabilities
                prob = individual_probs[combo[0]]
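            else:
                # Intersection probabilities. A missing entry falls back to 0.0,
                # which treats those events as never co-occurring, so supply
                # every overlap you have measured.
                prob = intersections.get(combo_set, 0.0)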
            
            union += sign * prob
    
    return union

# Real-world example: feature adoption
# Features: A (dark mode), B (notifications), C (analytics)
individual_probs = {
    'dark_mode': 0.40,
    'notifications': 0.30,
    'analytics': 0.25
}

intersections = {
    frozenset(['dark_mode', 'notifications']): 0.15,
    frozenset(['dark_mode', 'analytics']): 0.12,
    frozenset(['notifications', 'analytics']): 0.10,
    frozenset(['dark_mode', 'notifications', 'analytics']): 0.05
}

adoption_rate = union_probability_multiple(individual_probs, intersections)
print(f"At least one feature enabled: {adoption_rate:.2%}")
# Output: At least one feature enabled: 63.00%
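One way to build confidence in an inclusion-exclusion calculation is to compare it with a brute-force count on a sample space small enough to enumerate. The sketch below uses a single roll of a fair die and three illustrative events. The event names and the `prob` helper are assumptions made for this example. It uses `fractions.Fraction` so the comparison is exact.

```python
from fractions import Fraction
from itertools import combinations

# One roll of a fair six-sided die
sample_space = {1, 2, 3, 4, 5, 6}
events = {
    "even": {2, 4, 6},
    "greater_than_3": {4, 5, 6},
    "prime": {2, 3, 5},
}

def prob(outcomes):
    """Exact probability of a set of equally likely outcomes."""
    return Fraction(len(outcomes), len(sample_space))

# Inclusion-exclusion: alternate signs over subsets of size 1, 2, 3
by_formula = Fraction(0)
for size in range(1, len(events) + 1):
    sign = 1 if size % 2 == 1 else -1
    for combo in combinations(events.values(), size):
        by_formula += sign * prob(set.intersection(*combo))

# Direct count of the union: every outcome in at least one event
by_counting = prob(set.union(*events.values()))

print(by_formula, by_counting)  # 5/6 5/6
```

Here the three-way intersection is empty, so the final term contributes nothing. The two pairwise overlaps involving "prime" each hold a single outcome.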

Practical Applications

Union probability calculations power critical business metrics. Here’s a TypeScript sketch of an analytics helper class (the multi-segment case is left unimplemented):

class UnionProbabilityCalculator {
    /**
     * Calculate union probability for two events with validation.
     */
    static twoEvents(pA: number, pB: number, pIntersection: number): number {
        this.validate(pA, pB, pIntersection);
        
        if (pIntersection > Math.min(pA, pB)) {
            throw new Error('Intersection cannot exceed min(P(A), P(B))');
        }
        
        return pA + pB - pIntersection;
    }
    
    /**
     * Estimate the probability of at least one error occurring.
     * Use case: SLA monitoring with multiple failure modes.
     */
    static errorMonitoring(errorRates: Map<string, number>): number {
        const rates = Array.from(errorRates.values());
        this.validate(...rates);
        
        // Assumes the failure modes are independent. This is an approximation,
        // not a bound: correlated failures can push the true value either way.
        const noneOccur = rates.reduce((acc, rate) => acc * (1 - rate), 1);
        
        return 1 - noneOccur;
    }
    
    /**
     * Calculate user segment overlap.
     */
    static segmentOverlap(
        segments: Record<string, number>,
        overlaps: Record<string, number>
    ): number {
        const segmentNames = Object.keys(segments);
        
        if (segmentNames.length === 2) {
            const [a, b] = segmentNames;
            // Accept either key order, since Object.keys order depends on insertion
            const intersection = overlaps[`${a}_${b}`] ?? overlaps[`${b}_${a}`] ?? 0;
            return this.twoEvents(segments[a], segments[b], intersection);
        }
        
        // For more segments, use inclusion-exclusion
        // Implementation similar to Python version
        throw new Error('Multiple segment calculation not shown for brevity');
    }
    
    private static validate(...probs: number[]): void {
        for (const p of probs) {
            if (p < 0 || p > 1) {
                throw new Error(`Invalid probability: ${p}`);
            }
        }
    }
}

// Example: API error monitoring
const errorRates = new Map([
    ['database_timeout', 0.02],
    ['rate_limit', 0.01],
    ['network_error', 0.015]
]);

const anyErrorProb = UnionProbabilityCalculator.errorMonitoring(errorRates);
console.log(`Probability of any error: ${(anyErrorProb * 100).toFixed(2)}%`);

Common Pitfalls and Best Practices

The most frequent mistake is forgetting the intersection term. Omitting it overstates the union whenever the events overlap, and the result can even exceed 1.0, which is impossible for a probability. Always validate that your outputs fall within [0, 1].

Another error is assuming independence when calculating intersections. If events A and B are independent, P(A ∩ B) = P(A) × P(B). But user behaviors, system failures, and business events often correlate. Measure actual intersections from data rather than assuming independence.
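When you cannot measure the intersections, it helps to see how much the independence assumption is actually buying you. The sketch below reuses the illustrative error rates from the monitoring example. It compares the independence-based estimate with two bounds that hold for any dependence between the events: the union is at least as likely as its most likely event, and at most the sum of the individual probabilities (Boole's inequality). The function names are made up for this example.

```python
import math

def union_if_independent(probs):
    """P(at least one) under the ASSUMPTION that all events are independent."""
    return 1 - math.prod(1 - p for p in probs)

def union_bounds(probs):
    """Bounds on P(at least one) that hold whatever the dependence."""
    lower = max(probs)             # the union contains its most likely event
    upper = min(1.0, sum(probs))   # Boole's inequality (the union bound)
    return lower, upper

error_rates = [0.02, 0.01, 0.015]  # illustrative rates
estimate = union_if_independent(error_rates)
lower, upper = union_bounds(error_rates)

print(f"independent estimate: {estimate:.4f}")            # 0.0444
print(f"valid range:          [{lower:.4f}, {upper:.4f}]")  # [0.0200, 0.0450]
```

With rates this small the independent estimate sits close to the upper bound. Strongly correlated failures could bring the true value down toward 2%.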

Floating-point arithmetic introduces rounding errors. A result that should equal exactly 1.0 can come out as 0.9999999999999999 or 1.0000000000000002, so compare against a small tolerance (epsilon) instead of testing for exact equality or strict bounds.
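Here is a small illustration of that rounding issue, using inputs whose union is exactly 1 on paper. It shows two common remedies: a tolerance check with `math.isclose`, and exact rational arithmetic with `fractions.Fraction`. The tolerance of 1e-9 is an arbitrary choice for this sketch, not a recommendation.

```python
import math
from fractions import Fraction

# Binary floats cannot represent most decimal fractions exactly
union = 0.8 + 0.9 - 0.7
print(union)          # 1.0000000000000002
print(union <= 1.0)   # False, although the exact value is 1

# Remedy 1: compare with a tolerance instead of a strict bound
print(math.isclose(union, 1.0, abs_tol=1e-9))   # True

# Remedy 2: exact rational arithmetic avoids the rounding entirely
exact = Fraction(8, 10) + Fraction(9, 10) - Fraction(7, 10)
print(exact == 1)     # True
```

Fractions are slower than floats, so they suit validation and tests better than high-volume pipelines.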

import unittest

class TestUnionProbability(unittest.TestCase):
    
    def test_basic_union(self):
        """Test standard union calculation."""
        result = calculate_union_probability(0.5, 0.5, 0.25)
        self.assertAlmostEqual(result['union'], 0.75)
    
    def test_mutually_exclusive(self):
        """Test mutually exclusive events."""
        result = calculate_union_probability(0.3, 0.4, 0.0)
        self.assertAlmostEqual(result['union'], 0.7)
    
    def test_complete_overlap(self):
        """Test when one event is subset of another."""
        result = calculate_union_probability(0.6, 0.3, 0.3)
        self.assertAlmostEqual(result['union'], 0.6)
    
    def test_invalid_intersection(self):
        """Test that intersection > min(P(A), P(B)) raises error."""
        with self.assertRaises(ValueError):
            calculate_union_probability(0.3, 0.4, 0.5)
    
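    def test_probability_bounds(self):
        """Ensure union probability stays within [0, 1], up to rounding."""
        result = calculate_union_probability(0.8, 0.9, 0.7)
        self.assertGreaterEqual(result['union'], 0.0)
        # 0.8 + 0.9 - 0.7 evaluates to 1.0000000000000002 in binary floating
        # point, so compare with a tolerance instead of a strict <= 1.0.
        self.assertAlmostEqual(result['union'], 1.0)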
    
    def test_floating_point_precision(self):
        """Handle floating-point arithmetic edge cases."""
        result = calculate_union_probability(0.1, 0.2, 0.05)
        # Use epsilon comparison
        self.assertTrue(abs(result['union'] - 0.25) < 1e-10)

if __name__ == '__main__':
    unittest.main()

Union probability is fundamental to data-driven decision making. Implement it correctly with proper validation, understand when events are truly independent, and always account for overlaps. Your analytics will be more accurate, your monitoring more reliable, and your A/B tests more trustworthy.