Poisson Distribution in Python: Complete Guide

Key Insights

  • The Poisson distribution models the probability of a given number of events occurring in a fixed interval, with the single parameter λ representing both the mean and variance.
  • SciPy’s scipy.stats.poisson module provides everything you need: probability calculations, random sampling, and statistical properties—all in a clean, consistent API.
  • When your observed variance significantly exceeds the mean (overdispersion), the Poisson model breaks down, and you should consider alternatives like the negative binomial distribution.

Introduction to the Poisson Distribution

The Poisson distribution answers a specific question: given that events occur independently at a constant average rate, what’s the probability of observing exactly k events in a fixed interval?

This makes it ideal for modeling:

  • Customer arrivals at a store per hour
  • HTTP requests hitting a server per second
  • Manufacturing defects per batch
  • Typos per page in a document
  • Traffic accidents per month at an intersection

Three assumptions must hold for Poisson to apply correctly:

  1. Events occur independently of each other
  2. The average rate (λ) remains constant over the interval
  3. Two events cannot occur at exactly the same instant

When these assumptions hold, Poisson provides remarkably accurate predictions. When they don’t, you’ll get misleading results—something we’ll address at the end.

Mathematical Foundation

The Poisson probability mass function (PMF) gives the probability of observing exactly k events:

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

Where:

  • λ (lambda) is the average number of events per interval
  • k is the specific number of events you’re calculating probability for
  • e is Euler’s number (~2.71828)

A key property that distinguishes Poisson: the mean equals the variance. Both equal λ. This property becomes important for model validation—if your data shows variance much larger than the mean, Poisson isn’t appropriate.

The Poisson distribution connects to others in useful ways. It’s the limiting case of the binomial distribution when n is large and p is small (with λ = np). The time between Poisson events follows an exponential distribution with rate λ.
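Both connections are easy to check numerically. The sketch below (parameter values chosen arbitrarily for illustration) compares a binomial PMF against its Poisson limit, then simulates a Poisson process from exponential inter-arrival gaps and counts events per unit interval:

```python
import numpy as np
from scipy import stats

# Binomial(n, p) with large n and small p is close to Poisson(λ = np)
n, p = 10_000, 0.0005  # λ = np = 5
k = np.arange(15)
max_diff = np.abs(stats.binom.pmf(k, n, p) - stats.poisson.pmf(k, mu=5)).max()
print(f"Max PMF difference: {max_diff:.2e}")  # tiny

# Simulate a Poisson process: exponential gaps at rate λ,
# then count the events falling in each unit-length window
lam = 5
rng = np.random.default_rng(0)
gaps = rng.exponential(scale=1 / lam, size=200_000)
event_times = np.cumsum(gaps)
counts = np.bincount(event_times.astype(int))[:-1]  # drop the partial last window
print(f"Counts per window: mean {counts.mean():.2f}, variance {counts.var():.2f}")
# Both should be close to λ = 5
```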

Here’s how to calculate the PMF manually:

import math

def poisson_pmf(k: int, lam: float) -> float:
    """Calculate Poisson probability manually."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)

# Probability of exactly 3 events when average is 5
lam = 5
k = 3
prob = poisson_pmf(k, lam)
print(f"P(X = {k}) when λ = {lam}: {prob:.4f}")
# Output: P(X = 3) when λ = 5: 0.1404

# Calculate probabilities for k = 0 to 10
for k in range(11):
    print(f"P(X = {k}): {poisson_pmf(k, lam):.4f}")

This manual approach helps understand the math, but for production code, use SciPy.

Implementing Poisson in Python with SciPy

The scipy.stats.poisson module provides a complete implementation. Here are the methods you’ll use most:

from scipy import stats
import numpy as np

# Create a Poisson distribution with λ = 4
lam = 4
poisson_dist = stats.poisson(mu=lam)

# PMF: Probability of exactly k events
print(f"P(X = 3): {poisson_dist.pmf(3):.4f}")  # 0.1954
print(f"P(X = 4): {poisson_dist.pmf(4):.4f}")  # 0.1954

# CDF: Probability of k or fewer events
print(f"P(X ≤ 3): {poisson_dist.cdf(3):.4f}")  # 0.4335
print(f"P(X ≤ 6): {poisson_dist.cdf(6):.4f}")  # 0.8893

# Survival function: P(X > k) = 1 - CDF(k)
print(f"P(X > 6): {poisson_dist.sf(6):.4f}")   # 0.1107

# Statistical properties
print(f"Mean: {poisson_dist.mean()}")          # 4.0
print(f"Variance: {poisson_dist.var()}")       # 4.0
print(f"Std Dev: {poisson_dist.std():.4f}")    # 2.0

# Generate random samples
np.random.seed(42)
samples = poisson_dist.rvs(size=1000)
print(f"Sample mean: {samples.mean():.3f}")    # ~4.0
print(f"Sample var: {samples.var():.3f}")      # ~4.0

You can also use the module directly without creating a distribution object:

# Direct function calls
prob = stats.poisson.pmf(k=3, mu=4)
cumulative = stats.poisson.cdf(k=5, mu=4)
random_values = stats.poisson.rvs(mu=4, size=100)

For calculating probabilities over ranges, vectorize your operations:

# Probability of 2 to 5 events (inclusive)
k_values = np.arange(2, 6)
probs = stats.poisson.pmf(k_values, mu=4)
total_prob = probs.sum()
print(f"P(2 ≤ X ≤ 5): {total_prob:.4f}")  # 0.6936

# Alternative using CDF: P(X ≤ 5) - P(X ≤ 1)
prob_range = poisson_dist.cdf(5) - poisson_dist.cdf(1)
print(f"P(2 ≤ X ≤ 5): {prob_range:.4f}")  # 0.6936

Visualizing the Poisson Distribution

Visualization reveals how λ shapes the distribution. Small λ values produce right-skewed distributions; larger values approach symmetry (and approximate normality).

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Different lambda values to compare
lambdas = [2, 5, 10]
colors = ['#e74c3c', '#3498db', '#2ecc71']
k_max = 20

# PMF comparison
ax1 = axes[0]
for i, (lam, color) in enumerate(zip(lambdas, colors)):
    k = np.arange(0, k_max + 1)
    pmf = stats.poisson.pmf(k, mu=lam)
    ax1.bar(k + i * 0.25, pmf, width=0.25,
            label=f'λ = {lam}', color=color, alpha=0.8)

ax1.set_xlabel('k (number of events)')
ax1.set_ylabel('P(X = k)')
ax1.set_title('Poisson PMF for Different λ Values')
ax1.legend()
ax1.set_xlim(-0.5, k_max)

# CDF comparison
ax2 = axes[1]
for lam, color in zip(lambdas, colors):
    k = np.arange(0, k_max + 1)
    cdf = stats.poisson.cdf(k, mu=lam)
    ax2.step(k, cdf, where='mid', label=f'λ = {lam}', 
             color=color, linewidth=2)

ax2.set_xlabel('k (number of events)')
ax2.set_ylabel('P(X ≤ k)')
ax2.set_title('Poisson CDF for Different λ Values')
ax2.legend()
ax2.set_xlim(-0.5, k_max)
ax2.set_ylim(0, 1.05)

plt.tight_layout()
plt.savefig('poisson_comparison.png', dpi=150)
plt.show()

Notice how λ = 2 is heavily right-skewed, while λ = 10 looks nearly symmetric. This convergence to normality accelerates as λ increases—a useful approximation when λ > 20.
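You can check that approximation directly. The sketch below (λ = 30 picked arbitrarily) compares an exact Poisson tail probability against the normal approximation N(λ, λ) with a continuity correction:

```python
import numpy as np
from scipy import stats

lam = 30
k = 35

# Exact tail probability vs. normal approximation with continuity correction
exact = stats.poisson.sf(k, mu=lam)                           # P(X > 35)
approx = stats.norm.sf(k + 0.5, loc=lam, scale=np.sqrt(lam))  # N(λ, λ)
print(f"Exact:  {exact:.4f}")
print(f"Approx: {approx:.4f}")
```

The two values should agree closely, and the approximation tightens as λ grows.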

Parameter Estimation (MLE)

Given observed count data, the maximum likelihood estimate for λ is simply the sample mean. This makes intuitive sense: your best estimate of the true average rate is the observed average.

import numpy as np
from scipy import stats

# Simulated observed data: daily customer complaints over 30 days
np.random.seed(42)
observed_data = np.array([3, 5, 2, 4, 6, 3, 4, 5, 2, 3,
                          4, 5, 3, 4, 2, 5, 6, 4, 3, 4,
                          5, 3, 4, 2, 5, 4, 3, 6, 4, 5])

# MLE estimate: sample mean
lambda_mle = observed_data.mean()
print(f"MLE estimate of λ: {lambda_mle:.3f}")

# Standard error of the estimate
se_lambda = np.sqrt(lambda_mle / len(observed_data))
print(f"Standard error: {se_lambda:.3f}")

# 95% confidence interval (using normal approximation)
z_critical = 1.96
ci_lower = lambda_mle - z_critical * se_lambda
ci_upper = lambda_mle + z_critical * se_lambda
print(f"95% CI: [{ci_lower:.3f}, {ci_upper:.3f}]")

# Goodness of fit: compare observed variance to the mean
observed_var = observed_data.var(ddof=1)  # sample variance
print(f"\nObserved mean: {lambda_mle:.3f}")
print(f"Observed variance: {observed_var:.3f}")
print(f"Variance/Mean ratio: {observed_var/lambda_mle:.3f}")
# Ratio close to 1.0 suggests Poisson is appropriate

For a formal goodness-of-fit test, use the chi-squared test:

from scipy.stats import chisquare

# Bin observations by count value
max_k = int(observed_data.max())
observed_freq = np.bincount(observed_data, minlength=max_k + 1)

# Expected frequencies under Poisson(λ̂); fold the tail mass P(X > max_k)
# into the last bin so observed and expected totals match, as chisquare requires
expected_prob = stats.poisson.pmf(np.arange(max_k + 1), mu=lambda_mle)
expected_prob[-1] += stats.poisson.sf(max_k, mu=lambda_mle)
expected_freq = expected_prob * len(observed_data)

# In practice, merge bins with expected frequency < 5 before testing;
# ddof=1 accounts for estimating λ from the same data
chi2_stat, p_value = chisquare(observed_freq, expected_freq, ddof=1)
print(f"\nChi-squared statistic: {chi2_stat:.3f}")
print(f"P-value: {p_value:.3f}")

Practical Applications

Let’s work through a realistic scenario: capacity planning for a web server.

Your server receives an average of 50 requests per second during peak hours. You need to determine how much capacity headroom to maintain.

import numpy as np
from scipy import stats

# Average requests per second
lambda_rate = 50

# Create the distribution
request_dist = stats.poisson(mu=lambda_rate)

# What's the probability of receiving more than 60 requests?
prob_over_60 = request_dist.sf(60)
print(f"P(requests > 60): {prob_over_60:.4f}")  # ~0.0722

# What's the probability of receiving more than 70 requests?
prob_over_70 = request_dist.sf(70)
print(f"P(requests > 70): {prob_over_70:.4f}")  # ~0.0028

# Find the 99th percentile (capacity planning target)
percentile_99 = request_dist.ppf(0.99)
print(f"99th percentile: {percentile_99} requests")  # 67

# Simulate one hour of traffic (3600 seconds)
np.random.seed(42)
hourly_traffic = request_dist.rvs(size=3600)

# Analyze the simulation
print(f"\nHourly simulation results:")
print(f"Total requests: {hourly_traffic.sum():,}")
print(f"Max in any second: {hourly_traffic.max()}")
print(f"Seconds exceeding 65 requests: {(hourly_traffic > 65).sum()}")

# Calculate probability of at least one overload event
# (assuming capacity = 70 requests/second)
capacity = 70
prob_single_overload = request_dist.sf(capacity)
prob_no_overload_hour = (1 - prob_single_overload) ** 3600
prob_at_least_one_overload = 1 - prob_no_overload_hour
print(f"\nP(at least one overload in an hour): {prob_at_least_one_overload:.4f}")

This analysis tells you that with capacity for 70 requests/second, any individual second exceeds capacity only about 0.28% of the time. Across 3,600 seconds those small risks compound, though: at least one overload during the hour is a near certainty (probability ≈ 0.9999).

Limitations and Alternatives

The Poisson distribution fails when its assumptions break down. The most common issue is overdispersion: when variance exceeds the mean.

import numpy as np
from scipy import stats

# Example: overdispersed data (variance >> mean)
overdispersed_data = np.array([0, 0, 1, 0, 12, 0, 1, 0, 0, 15,
                               0, 0, 2, 0, 0, 18, 0, 1, 0, 0])

mean_val = overdispersed_data.mean()
var_val = overdispersed_data.var()
dispersion_ratio = var_val / mean_val

print(f"Mean: {mean_val:.2f}")
print(f"Variance: {var_val:.2f}")
print(f"Dispersion ratio: {dispersion_ratio:.2f}")
# Ratio >> 1 indicates overdispersion

# Negative binomial as alternative
# Fit using method of moments
n_param = mean_val ** 2 / (var_val - mean_val)
p_param = mean_val / var_val

print(f"\nNegative binomial parameters:")
print(f"n: {n_param:.3f}, p: {p_param:.3f}")
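As a quick sanity check on that method-of-moments fit, you can plug the parameters into scipy.stats.nbinom and confirm it reproduces the sample mean and variance exactly (a property of moment matching, not evidence the model is right):

```python
import numpy as np
from scipy import stats

overdispersed_data = np.array([0, 0, 1, 0, 12, 0, 1, 0, 0, 15,
                               0, 0, 2, 0, 0, 18, 0, 1, 0, 0])
mean_val = overdispersed_data.mean()
var_val = overdispersed_data.var()

# Method-of-moments parameters (valid only when variance > mean)
n_param = mean_val ** 2 / (var_val - mean_val)
p_param = mean_val / var_val

nb = stats.nbinom(n_param, p_param)
print(f"Fitted mean: {nb.mean():.2f} vs data mean: {mean_val:.2f}")
print(f"Fitted variance: {nb.var():.2f} vs data variance: {var_val:.2f}")
```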

Consider these alternatives when Poisson doesn’t fit:

  • Negative binomial: Handles overdispersion by adding a dispersion parameter
  • Zero-inflated Poisson: For data with excess zeros (e.g., insurance claims)
  • Conway-Maxwell-Poisson: Handles both over- and under-dispersion
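To make the zero-inflated variant concrete, here is a minimal sketch of its PMF: a mixture of a point mass at zero (with weight pi_zero, an illustrative parameter, not fitted to anything) and an ordinary Poisson:

```python
import numpy as np
from scipy import stats

def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson PMF: point mass at 0 mixed with Poisson(λ)."""
    pmf = (1 - pi_zero) * stats.poisson.pmf(k, mu=lam)
    return np.where(np.asarray(k) == 0, pi_zero + pmf, pmf)

k = np.arange(6)
print(zip_pmf(k, lam=3.0, pi_zero=0.4))  # note the inflated P(X = 0)
```

The PMF still sums to 1 because the Poisson component is down-weighted by (1 - pi_zero).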

The Poisson distribution remains your go-to model for count data, but always verify the mean-variance relationship before trusting its predictions.
