How to Perform the Ramsey RESET Test in Python

Key Insights

  • The Ramsey RESET test detects functional form misspecification in linear regression models by checking whether powers of fitted values add explanatory power—if they do, your model is likely missing nonlinear relationships.
  • Python’s statsmodels library provides linear_reset() for running the test in one line, but understanding the output requires knowing that a low p-value (typically < 0.05) signals your linear specification is inadequate.
  • When RESET fails, don’t panic—the fix often involves adding polynomial terms, applying log transformations, or including interaction variables rather than abandoning linear regression entirely.

Introduction to the Ramsey RESET Test

You’ve built a linear regression model. The R-squared looks decent, residuals seem reasonable, and coefficients make intuitive sense. But here’s the uncomfortable question: is your linear specification actually correct, or are you missing important nonlinear relationships in your data?

The Ramsey RESET test (Regression Equation Specification Error Test), developed by James Ramsey in 1969, directly addresses this concern. It’s a diagnostic tool that detects functional form misspecification—situations where the true relationship between variables is nonlinear, but you’ve forced a linear model onto it.

Use RESET when you suspect your model might be missing:

  • Quadratic or polynomial relationships
  • Logarithmic transformations
  • Interaction effects between variables
  • Any nonlinear functional form

The test won’t tell you what the correct specification is, but it will tell you whether your current linear model is likely wrong. That’s valuable information before you start interpreting coefficients or making predictions.

The Theory Behind RESET

The logic behind RESET is elegant. If your linear model is correctly specified, then the fitted values (ŷ) should capture all the systematic variation in your dependent variable. Adding powers of these fitted values (ŷ², ŷ³) to your regression shouldn’t improve the model because there’s nothing left to explain.

However, if your model is misspecified—say, the true relationship is quadratic but you’ve fit a linear model—then ŷ² will correlate with the omitted nonlinear component. Adding it to the regression will produce statistically significant coefficients.

The null hypothesis: The model is correctly specified (no functional form misspecification).

The alternative hypothesis: The model is misspecified; nonlinear terms would improve the fit.

The test procedure works as follows:

  1. Estimate your original linear regression and obtain fitted values ŷ
  2. Create an augmented regression by adding ŷ², ŷ³, etc. as additional regressors
  3. Test whether these added terms are jointly significant using an F-test
  4. If significant, reject the null hypothesis of correct specification

The standard implementation uses ŷ² and ŷ³ (powers 2 and 3), though you can adjust this. A significant F-statistic indicates that the powers of fitted values add explanatory power that they would not have if your model were correctly specified.

Setting Up Your Python Environment

You’ll need three core libraries for this workflow. Install them if you haven’t already:

pip install statsmodels numpy pandas

Now set up your imports and create a sample dataset that deliberately includes a nonlinear relationship—this will help demonstrate what a failing RESET test looks like:

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import linear_reset

# Set seed for reproducibility
np.random.seed(42)

# Generate sample data with a quadratic relationship
n = 200
x = np.random.uniform(1, 10, n)
# True relationship: y = 5 + 2*x + 0.5*x^2 + noise
y = 5 + 2 * x + 0.5 * x**2 + np.random.normal(0, 3, n)

# Create DataFrame
df = pd.DataFrame({'x': x, 'y': y})

# Also create a linear dataset for comparison
y_linear = 5 + 3 * x + np.random.normal(0, 3, n)
df_linear = pd.DataFrame({'x': x, 'y': y_linear})

print(df.head())
print(f"\nDataset shape: {df.shape}")

This gives us two datasets: one where a linear model will fail the RESET test (quadratic true relationship) and one where it should pass (linear true relationship).

Running RESET Test with Statsmodels

The linear_reset() function in statsmodels makes running the test straightforward. Let’s first fit a linear model to our quadratic data and see what happens:

# Fit a linear model to data with quadratic relationship
X = sm.add_constant(df['x'])  # Add intercept
model_misspecified = sm.OLS(df['y'], X).fit()

print("=== Model Summary (Misspecified) ===")
print(f"R-squared: {model_misspecified.rsquared:.4f}")
print(f"Coefficients: {model_misspecified.params.values}")

# Run RESET test
reset_result = linear_reset(model_misspecified, power=3, use_f=True)

print("\n=== RESET Test Results ===")
print(f"F-statistic: {reset_result.fvalue:.4f}")
print(f"p-value: {reset_result.pvalue:.6f}")
print(f"Degrees of freedom: {reset_result.df_num}, {reset_result.df_denom}")

The power=3 argument means the test includes ŷ² and ŷ³ (powers 2 through 3). Setting use_f=True returns an F-test rather than a chi-squared test.

Now compare with the correctly specified linear data:

# Fit a linear model to truly linear data
X_linear = sm.add_constant(df_linear['x'])
model_correct = sm.OLS(df_linear['y'], X_linear).fit()

print("\n=== Model Summary (Correctly Specified) ===")
print(f"R-squared: {model_correct.rsquared:.4f}")

# Run RESET test
reset_correct = linear_reset(model_correct, power=3, use_f=True)

print("\n=== RESET Test Results ===")
print(f"F-statistic: {reset_correct.fvalue:.4f}")
print(f"p-value: {reset_correct.pvalue:.6f}")

You’ll see a stark difference: the misspecified model produces a very low p-value (typically < 0.001), while the correctly specified model produces a high p-value (typically > 0.1).

Interpreting Results and Decision Making

The interpretation is straightforward but requires careful attention to what you’re actually testing:

  • Low p-value (< α): Reject the null hypothesis. Your model has functional form misspecification.
  • High p-value (≥ α): Fail to reject the null. No evidence of misspecification (but this doesn’t prove correct specification).

Common significance levels are 0.05 or 0.10. Here’s how to build this into your workflow:

def interpret_reset_test(model, alpha=0.05, power=3):
    """
    Run and interpret the RESET test for a fitted OLS model.
    
    Parameters:
    -----------
    model : statsmodels OLS results object
    alpha : significance level (default 0.05)
    power : highest power of fitted values to include (default 3)
    
    Returns:
    --------
    dict with test results and interpretation
    """
    reset_result = linear_reset(model, power=power, use_f=True)
    
    passed = reset_result.pvalue >= alpha
    
    interpretation = {
        'f_statistic': reset_result.fvalue,
        'p_value': reset_result.pvalue,
        'alpha': alpha,
        'passed': passed,
        'conclusion': None
    }
    
    if passed:
        interpretation['conclusion'] = (
            f"PASS: p-value ({reset_result.pvalue:.4f}) >= {alpha}. "
            "No evidence of functional form misspecification."
        )
    else:
        interpretation['conclusion'] = (
            f"FAIL: p-value ({reset_result.pvalue:.4f}) < {alpha}. "
            "Evidence suggests functional form misspecification. "
            "Consider adding nonlinear terms."
        )
    
    return interpretation

# Test both models
print("Misspecified model:")
result1 = interpret_reset_test(model_misspecified)
print(result1['conclusion'])

print("\nCorrectly specified model:")
result2 = interpret_reset_test(model_correct)
print(result2['conclusion'])

One critical caveat: failing to reject the null doesn’t mean your model is correctly specified. RESET has power against specific types of misspecification (omitted powers of fitted values), but it won’t catch everything. A model can pass RESET and still be wrong in other ways.

What to Do When RESET Fails

When your model fails the RESET test, you have several options. The most common fixes involve adding nonlinear terms to your regression:

# Original misspecified model
X_original = sm.add_constant(df['x'])
model_original = sm.OLS(df['y'], X_original).fit()

print("=== Original Model ===")
print(f"R-squared: {model_original.rsquared:.4f}")
reset_original = linear_reset(model_original, power=3, use_f=True)
print(f"RESET p-value: {reset_original.pvalue:.6f}")

# Fix 1: Add polynomial terms
df['x_squared'] = df['x'] ** 2
X_poly = sm.add_constant(df[['x', 'x_squared']])
model_poly = sm.OLS(df['y'], X_poly).fit()

print("\n=== Model with Quadratic Term ===")
print(f"R-squared: {model_poly.rsquared:.4f}")
reset_poly = linear_reset(model_poly, power=3, use_f=True)
print(f"RESET p-value: {reset_poly.pvalue:.6f}")

# Fix 2: Log transformation (if data allows)
# Only works when x > 0, which it is in our case
df['log_x'] = np.log(df['x'])
X_log = sm.add_constant(df['log_x'])
model_log = sm.OLS(df['y'], X_log).fit()

print("\n=== Model with Log Transform ===")
print(f"R-squared: {model_log.rsquared:.4f}")
reset_log = linear_reset(model_log, power=3, use_f=True)
print(f"RESET p-value: {reset_log.pvalue:.6f}")

Since our true data-generating process includes a quadratic term, the polynomial model should pass RESET while the log transformation might not fully capture the relationship.

Here’s a more systematic approach to model selection when RESET fails:

def compare_specifications(y, X_dict, alpha=0.05):
    """
    Compare multiple model specifications using RESET test.
    
    Parameters:
    -----------
    y : dependent variable
    X_dict : dict of {name: X_matrix} for each specification
    alpha : significance level
    
    Returns:
    --------
    DataFrame with comparison results
    """
    results = []
    
    for name, X in X_dict.items():
        model = sm.OLS(y, X).fit()
        reset = linear_reset(model, power=3, use_f=True)
        
        results.append({
            'Specification': name,
            'R-squared': model.rsquared,
            'Adj. R-squared': model.rsquared_adj,
            'AIC': model.aic,
            'RESET F-stat': reset.fvalue,
            'RESET p-value': reset.pvalue,
            'Passes RESET': reset.pvalue >= alpha
        })
    
    return pd.DataFrame(results)

# Define specifications to compare
specifications = {
    'Linear': sm.add_constant(df['x']),
    'Quadratic': sm.add_constant(df[['x', 'x_squared']]),
    'Log': sm.add_constant(df['log_x']),
}

comparison = compare_specifications(df['y'], specifications)
print(comparison.to_string(index=False))

This systematic comparison helps you choose among alternative specifications based on both fit statistics and diagnostic tests.

Conclusion

The Ramsey RESET test belongs in your standard regression diagnostic toolkit, right alongside tests for heteroskedasticity and autocorrelation. Run it after fitting any linear regression where you’re uncertain about the functional form.

Key points to remember:

  1. Run RESET early in your modeling workflow. There’s no point optimizing a misspecified model.
  2. A passing test doesn’t guarantee correct specification. RESET has specific power against omitted polynomial terms but won’t catch all forms of misspecification.
  3. When RESET fails, start simple. Try adding squared terms or log transformations before jumping to complex nonlinear models.
  4. Consider the economics/domain knowledge. Statistical tests should inform, not replace, substantive reasoning about relationships in your data.

The statsmodels implementation makes this test trivially easy to run. The harder part is knowing what to do with the results—and that requires understanding both the statistical theory and your specific modeling context.
