How to Perform the Breusch-Pagan Test in Python
Key Insights
- The Breusch-Pagan test detects heteroscedasticity by regressing squared residuals against your independent variables—a significant p-value (< 0.05) indicates non-constant variance that violates OLS assumptions.
- Statsmodels provides het_breuschpagan(), which returns four values: use the Lagrange Multiplier p-value for large samples and the F-statistic p-value for smaller datasets.
- When heteroscedasticity is present, don't abandon your model: use robust standard errors (HC3) or weighted least squares to obtain valid inference without changing your coefficient estimates.
Introduction to Heteroscedasticity
Ordinary Least Squares regression assumes that the variance of your residuals remains constant across all levels of your independent variables. This property is called homoscedasticity. When this assumption fails—when residual variance changes systematically—you have heteroscedasticity.
Why does this matter? Heteroscedasticity doesn’t bias your coefficient estimates, but it invalidates your standard errors. This means your confidence intervals are wrong, your t-statistics are unreliable, and your p-values are misleading. You might conclude a variable is significant when it isn’t, or miss genuine effects entirely.
Let’s visualize the difference:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
# Homoscedastic data: constant variance
x_homo = np.linspace(1, 10, 100)
y_homo = 2 * x_homo + np.random.normal(0, 2, 100)
# Heteroscedastic data: variance increases with x
x_hetero = np.linspace(1, 10, 100)
y_hetero = 2 * x_hetero + np.random.normal(0, x_hetero * 0.8, 100)
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
axes[0].scatter(x_homo, y_homo, alpha=0.6)
axes[0].set_title('Homoscedastic Residuals')
axes[0].set_xlabel('X')
axes[0].set_ylabel('Y')
axes[1].scatter(x_hetero, y_hetero, alpha=0.6)
axes[1].set_title('Heteroscedastic Residuals')
axes[1].set_xlabel('X')
axes[1].set_ylabel('Y')
plt.tight_layout()
plt.show()
The left plot shows points with consistent spread around the underlying line. The right plot shows the classic "fan" or "cone" pattern, where spread increases with the predictor. This visual inspection is useful, but we need a formal statistical test to make defensible decisions.
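The claim that heteroscedasticity leaves coefficients unbiased while distorting standard errors can be checked with a quick Monte Carlo sketch (the sample size, noise scale, and simulation count here are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(42)
n, n_sims, true_slope = 100, 2000, 2.0
x = np.linspace(1, 10, n)
X = np.column_stack([np.ones(n), x])

slopes = np.empty(n_sims)
nominal_ses = np.empty(n_sims)
for i in range(n_sims):
    y = true_slope * x + rng.normal(0, x * 0.8)   # fan-shaped errors, as above
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - 2)              # homoscedasticity-assuming variance estimate
    slopes[i] = beta[1]
    nominal_ses[i] = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

print(f"Mean estimated slope:   {slopes.mean():.3f} (true value: {true_slope})")
print(f"Empirical SD of slope:  {slopes.std():.4f}")
print(f"Average nominal OLS SE: {nominal_ses.mean():.4f}")
```

The slope estimates stay centered on the true value, but the nominal OLS standard error (which assumes constant variance) typically differs from the empirical spread of the estimates, which is exactly the inference problem described above.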
What is the Breusch-Pagan Test?
The Breusch-Pagan test, developed by Trevor Breusch and Adrian Pagan in 1979, provides a formal hypothesis test for heteroscedasticity. The test works by examining whether the squared residuals from your regression can be predicted by your independent variables.
Null Hypothesis (H₀): Homoscedasticity is present (constant variance)
Alternative Hypothesis (H₁): Heteroscedasticity is present (non-constant variance)
The procedure is straightforward:
- Fit your OLS regression and obtain the residuals
- Square the residuals
- Regress the squared residuals on your original independent variables
- Test whether this auxiliary regression has explanatory power
If your independent variables can predict the squared residuals, that’s evidence that variance depends on those variables—heteroscedasticity.
The test produces a Lagrange Multiplier (LM) statistic that follows a chi-squared distribution under the null hypothesis. A small p-value (typically < 0.05) leads you to reject the null and conclude that heteroscedasticity is present.
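The four steps above fit in a few lines of code. This sketch computes the studentized version of the LM statistic (n times the R² of the auxiliary regression), which is the statistic statsmodels computes; the helper name is our own:

```python
import numpy as np
from scipy import stats

def breusch_pagan_manual(residuals, exog):
    """Studentized Breusch-Pagan: regress squared residuals on exog, LM = n * R^2.

    Assumes exog includes a constant column.
    """
    sq_resid = residuals ** 2                      # step 2: square the residuals
    # step 3: auxiliary OLS of squared residuals on the original regressors
    beta, *_ = np.linalg.lstsq(exog, sq_resid, rcond=None)
    fitted = exog @ beta
    ss_res = np.sum((sq_resid - fitted) ** 2)
    ss_tot = np.sum((sq_resid - sq_resid.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    n = len(residuals)
    k = exog.shape[1] - 1                          # regressors excluding the constant
    lm = n * r2                                    # step 4: LM statistic
    p_value = stats.chi2.sf(lm, df=k)
    return lm, p_value
```

Running this on the residuals of an OLS fit (with the design matrix including its constant column) should closely agree with the library function introduced below.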
Setting Up Your Environment
You’ll need these libraries:
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats
# Check versions for reproducibility
print(f"NumPy: {np.__version__}")
print(f"Pandas: {pd.__version__}")
print(f"Statsmodels: {sm.__version__}")
Install missing packages with:
pip install numpy pandas statsmodels scipy
Let’s create two datasets—one with homoscedastic errors and one with clear heteroscedasticity:
np.random.seed(123)
n = 200
# Dataset 1: Homoscedastic (constant variance)
X1 = np.random.uniform(1, 10, n)
epsilon1 = np.random.normal(0, 3, n) # Constant variance
y1 = 5 + 2.5 * X1 + epsilon1
df_homo = pd.DataFrame({'X': X1, 'y': y1})
# Dataset 2: Heteroscedastic (variance increases with X)
X2 = np.random.uniform(1, 10, n)
epsilon2 = np.random.normal(0, 1, n) * X2  # Std dev proportional to X (variance proportional to X squared)
y2 = 5 + 2.5 * X2 + epsilon2
df_hetero = pd.DataFrame({'X': X2, 'y': y2})
print("Homoscedastic dataset shape:", df_homo.shape)
print("Heteroscedastic dataset shape:", df_hetero.shape)
Running the Breusch-Pagan Test with Statsmodels
The het_breuschpagan() function requires two inputs: the residuals from your fitted model and the design matrix (independent variables including the constant). Here’s the complete workflow:
def run_breusch_pagan_test(X, y, dataset_name="Dataset"):
    """
    Fit OLS model and run Breusch-Pagan test.

    Parameters
    ----------
    X : array-like
        Independent variable(s)
    y : array-like
        Dependent variable
    dataset_name : str
        Label for output

    Returns
    -------
    dict : Test results
    """
    # Add constant term for intercept
    X_with_const = sm.add_constant(X)

    # Fit OLS model
    model = sm.OLS(y, X_with_const)
    results = model.fit()

    # Extract residuals and exogenous variables
    residuals = results.resid
    exog = results.model.exog

    # Run Breusch-Pagan test
    lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(residuals, exog)

    print(f"\n{'='*50}")
    print(f"Breusch-Pagan Test Results: {dataset_name}")
    print(f"{'='*50}")
    print(f"Lagrange Multiplier Statistic: {lm_stat:.4f}")
    print(f"LM Test p-value: {lm_pvalue:.4f}")
    print(f"F-Statistic: {f_stat:.4f}")
    print(f"F-Test p-value: {f_pvalue:.4f}")

    return {
        'lm_stat': lm_stat,
        'lm_pvalue': lm_pvalue,
        'f_stat': f_stat,
        'f_pvalue': f_pvalue,
        'model_results': results
    }
# Test on homoscedastic data
results_homo = run_breusch_pagan_test(
    df_homo['X'],
    df_homo['y'],
    "Homoscedastic Data"
)

# Test on heteroscedastic data
results_hetero = run_breusch_pagan_test(
    df_hetero['X'],
    df_hetero['y'],
    "Heteroscedastic Data"
)
The function returns four values:
- LM Statistic: The Lagrange Multiplier test statistic (chi-squared distributed)
- LM p-value: P-value for the LM test—use this for large samples
- F-Statistic: Alternative test statistic (F-distributed)
- F p-value: P-value for the F-test—more reliable for smaller samples
For most applications with n > 30, the LM test works well. With smaller samples, prefer the F-test.
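Under the hood, the two p-values come from different reference distributions: the LM statistic is compared to a chi-squared distribution with k degrees of freedom (k = number of regressors, excluding the constant), while the F statistic uses (k, n - k - 1) degrees of freedom. A sketch with made-up statistic values for illustration:

```python
from scipy import stats

n, k = 200, 1          # observations, regressors excluding the constant
lm_stat = 12.5         # hypothetical LM statistic, for illustration only
f_stat = 13.1          # hypothetical F statistic, for illustration only

lm_pvalue = stats.chi2.sf(lm_stat, df=k)
f_pvalue = stats.f.sf(f_stat, dfn=k, dfd=n - k - 1)
print(f"LM p-value: {lm_pvalue:.4f}, F p-value: {f_pvalue:.4f}")
```

With a healthy sample size the two p-values are usually close; large disagreements are themselves a hint to look at your sample size.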
Interpreting Results
Let’s create a robust interpretation function:
def interpret_breusch_pagan(lm_pvalue, f_pvalue, alpha=0.05):
    """
    Interpret Breusch-Pagan test results with clear conclusions.

    Parameters
    ----------
    lm_pvalue : float
        P-value from LM test
    f_pvalue : float
        P-value from F test
    alpha : float
        Significance level (default 0.05)
    """
    print(f"\nInterpretation (α = {alpha}):")
    print("-" * 40)

    # Primary decision based on LM test
    if lm_pvalue < alpha:
        print(f"LM p-value ({lm_pvalue:.4f}) < {alpha}")
        print("→ REJECT null hypothesis")
        print("→ Evidence of HETEROSCEDASTICITY")
        print("\nRecommendation:")
        print("  • Use robust standard errors (HC3)")
        print("  • Consider weighted least squares")
        print("  • Check for model misspecification")
        decision = "heteroscedastic"
    else:
        print(f"LM p-value ({lm_pvalue:.4f}) >= {alpha}")
        print("→ FAIL TO REJECT null hypothesis")
        print("→ No significant evidence of heteroscedasticity")
        print("\nRecommendation:")
        print("  • Standard OLS inference is appropriate")
        print("  • Proceed with usual standard errors")
        decision = "homoscedastic"

    # Cross-check with F-test
    f_decision = "heteroscedastic" if f_pvalue < alpha else "homoscedastic"
    if f_decision != decision:
        print(f"\n⚠ Warning: F-test (p={f_pvalue:.4f}) suggests a different conclusion")
        print("  Consider sample size and use the F-test for small samples")

    return decision
# Interpret both datasets
print("\n" + "="*60)
print("HOMOSCEDASTIC DATA INTERPRETATION")
interpret_breusch_pagan(
    results_homo['lm_pvalue'],
    results_homo['f_pvalue']
)

print("\n" + "="*60)
print("HETEROSCEDASTIC DATA INTERPRETATION")
interpret_breusch_pagan(
    results_hetero['lm_pvalue'],
    results_hetero['f_pvalue']
)
You should see the homoscedastic dataset pass the test (high p-value, fail to reject null) while the heteroscedastic dataset fails (low p-value, reject null).
Addressing Heteroscedasticity
When the Breusch-Pagan test indicates heteroscedasticity, you have several options. The most practical for inference is using heteroscedasticity-consistent (robust) standard errors:
def compare_standard_errors(X, y):
    """
    Compare OLS standard errors with robust alternatives.
    """
    X_const = sm.add_constant(X)
    model = sm.OLS(y, X_const).fit()

    # Robust standard errors using HC3 (recommended for general use)
    model_robust = sm.OLS(y, X_const).fit(cov_type='HC3')

    print("Coefficient Comparison:")
    print("-" * 60)
    print(f"{'Parameter':<15} {'Coef':<10} {'OLS SE':<12} {'Robust SE':<12}")
    print("-" * 60)

    # Index by label rather than position to stay robust across pandas versions
    for param in ['const', 'X']:
        coef = model.params[param]
        ols_se = model.bse[param]
        robust_se = model_robust.bse[param]
        change = ((robust_se - ols_se) / ols_se) * 100
        print(f"{param:<15} {coef:<10.4f} {ols_se:<12.4f} {robust_se:<12.4f} ({change:+.1f}%)")

    print("\nOLS Model Summary (Standard SE):")
    print(f"  R-squared: {model.rsquared:.4f}")
    print(f"  X coefficient p-value: {model.pvalues['X']:.4f}")

    print("\nRobust Model Summary (HC3 SE):")
    print(f"  R-squared: {model_robust.rsquared:.4f}")
    print(f"  X coefficient p-value: {model_robust.pvalues['X']:.4f}")

    return model, model_robust
print("\nHeteroscedastic Data - Standard Errors Comparison:")
print("="*60)
model_ols, model_hc3 = compare_standard_errors(df_hetero['X'], df_hetero['y'])
The HC3 estimator (also known as MacKinnon-White) is the recommended default. It performs well across sample sizes and heteroscedasticity patterns. Notice that the coefficient estimates remain identical—only the standard errors change.
For severe heteroscedasticity, consider weighted least squares:
def weighted_least_squares_example(X, y):
    """
    Demonstrate WLS when the variance structure is known.
    """
    X_const = sm.add_constant(X)

    # Our errors have standard deviation proportional to X, so variance is
    # proportional to X**2 and the WLS weights (inverse variance) are 1/X**2
    weights = 1 / X**2

    # Fit WLS
    model_wls = sm.WLS(y, X_const, weights=weights).fit()

    print("\nWeighted Least Squares Results:")
    print("-" * 40)
    print(model_wls.summary().tables[1])

    return model_wls
wls_model = weighted_least_squares_example(df_hetero['X'], df_hetero['y'])
WLS requires knowing or estimating the variance structure. When that’s unclear, stick with robust standard errors.
Conclusion
The Breusch-Pagan test should be part of your standard regression diagnostic toolkit. Run it after fitting any OLS model where you care about inference—hypothesis tests, confidence intervals, or prediction intervals.
Key practices to follow:
- Always visualize residuals first. Plot residuals against fitted values and each predictor. The Breusch-Pagan test confirms what you see.
- Use appropriate p-value thresholds. The standard α = 0.05 works for most cases, but consider your context. In exploratory analysis, you might tolerate more risk.
- Don't over-correct. If the test doesn't reject homoscedasticity, use standard errors. Robust standard errors are less efficient when homoscedasticity holds.
- Combine with other diagnostics. The Breusch-Pagan test specifically checks whether variance relates to your regressors. The White test is more general. Use both when thoroughness matters.
- Address the root cause when possible. Sometimes heteroscedasticity signals model misspecification: a missing variable, wrong functional form, or outliers. Fix the model before reaching for robust standard errors.