How to Calculate the Durbin-Watson Statistic in R
Key Insights
- The Durbin-Watson statistic ranges from 0 to 4, where values near 2 indicate no autocorrelation, values below 2 suggest positive autocorrelation, and values above 2 indicate negative autocorrelation in regression residuals.
- R provides two primary packages for calculating the DW statistic: lmtest offers the straightforward dwtest() function, while car provides durbinWatsonTest() with bootstrap capabilities for more robust p-values.
- Detecting autocorrelation is only the first step: when present, you must address it through generalized least squares, Newey-West standard errors, or by adding lagged variables to your model.
Introduction to the Durbin-Watson Statistic
When you fit a linear regression model, you assume that your residuals are independent of each other. This assumption frequently breaks down with time-series data or any dataset where observations have a natural ordering. Autocorrelation—the correlation of residuals with their own lagged values—violates this independence assumption and wreaks havoc on your inference.
The Durbin-Watson (DW) statistic tests specifically for first-order autocorrelation in regression residuals. It measures whether consecutive residuals are correlated with each other. The statistic ranges from 0 to 4:
- DW ≈ 2: No autocorrelation present
- DW < 2: Positive autocorrelation (residuals tend to be followed by residuals of the same sign)
- DW > 2: Negative autocorrelation (residuals tend to be followed by residuals of opposite sign)
You should run this test whenever you’re working with time-series data, panel data with a time component, or any regression where observation order matters. Ignoring autocorrelation leads to underestimated standard errors, inflated t-statistics, and false confidence in your results.
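The statistic itself is simple: the sum of squared differences between successive residuals, divided by the sum of squared residuals. As a minimal sketch (the function name dw_by_hand and the argument fit are illustrative, not from any package), you can compute it directly:

```r
# Compute the DW statistic by hand from a fitted model's residuals
# (illustrative helper; `fit` is any fitted lm object)
dw_by_hand <- function(fit) {
  e <- residuals(fit)
  sum(diff(e)^2) / sum(e^2)  # diff(e) gives e_t - e_{t-1}
}
```

Strongly positively correlated residuals make successive differences small, pushing the numerator (and DW) toward 0; alternating signs make the differences large, pushing DW toward 4.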
Prerequisites and Setup
R doesn’t include the Durbin-Watson test in its base installation. You need either the lmtest package or the car package—I recommend installing both since they offer slightly different functionality.
# Install packages if you haven't already
install.packages("lmtest")
install.packages("car")
# Load the packages
library(lmtest)
library(car)
The lmtest package is lightweight and focused on diagnostic tests for linear models. The car package (Companion to Applied Regression) is heavier but provides additional options like bootstrapped p-values. For quick diagnostics, lmtest is sufficient. For publication-quality analysis, the bootstrap option in car adds robustness.
Creating a Sample Dataset
Let’s create a realistic scenario: modeling quarterly sales data where we suspect temporal patterns might create autocorrelated residuals. We’ll generate data with intentional autocorrelation so you can see how the test detects it.
# Set seed for reproducibility
set.seed(42)
# Generate time index (40 quarters = 10 years)
n <- 40
time <- 1:n
# Create predictor variables
advertising <- rnorm(n, mean = 100, sd = 20)
price <- rnorm(n, mean = 50, sd = 10)
# Generate autocorrelated errors (AR(1) process with phi = 0.7)
errors <- numeric(n)
errors[1] <- rnorm(1)
for (i in 2:n) {
  errors[i] <- 0.7 * errors[i-1] + rnorm(1)
}
# Create sales variable with autocorrelated errors
sales <- 200 + 2.5 * advertising - 1.5 * price + 10 * errors
# Combine into data frame
sales_data <- data.frame(
  quarter = time,
  sales = sales,
  advertising = advertising,
  price = price
)
# Fit the linear model
model <- lm(sales ~ advertising + price, data = sales_data)
# Check model summary
summary(model)
This code creates a dataset where sales depend on advertising spend and price, but the error terms follow an AR(1) process with a correlation coefficient of 0.7. This simulates a common real-world scenario where unobserved factors affecting sales persist across time periods.
For comparison, let’s also create a model without autocorrelation:
# Generate data WITHOUT autocorrelation
set.seed(123)
independent_errors <- rnorm(n)
sales_clean <- 200 + 2.5 * advertising - 1.5 * price + 10 * independent_errors
clean_data <- data.frame(
  quarter = time,
  sales = sales_clean,
  advertising = advertising,
  price = price
)
model_clean <- lm(sales ~ advertising + price, data = clean_data)
Calculating DW Statistic Using the lmtest Package
The lmtest package provides the most straightforward approach with its dwtest() function. Pass your fitted model object, and it returns the test statistic and p-value.
# Run Durbin-Watson test on the autocorrelated model
dw_result <- dwtest(model)
print(dw_result)
The output looks like this:
Durbin-Watson test
data: model
DW = 0.89543, p-value = 1.259e-05
alternative hypothesis: true autocorrelation is greater than 0
A DW statistic of 0.895 is substantially below 2, indicating positive autocorrelation. The tiny p-value (1.259e-05) confirms this is statistically significant—we reject the null hypothesis of no autocorrelation.
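Because dwtest() returns a standard "htest" object, you can also pull the numbers out programmatically, which is handy in scripted diagnostics (a sketch; the threshold of 0.05 is just a conventional choice):

```r
# Extract the statistic and p-value from the htest object
dw_stat <- unname(dw_result$statistic)
dw_p    <- dw_result$p.value
if (dw_p < 0.05) {
  message("Evidence of positive autocorrelation (DW = ",
          round(dw_stat, 3), ")")
}
```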
Now compare with the clean model:
# Run test on model without autocorrelation
dw_clean <- dwtest(model_clean)
print(dw_clean)
Durbin-Watson test
data: model_clean
DW = 2.1847, p-value = 0.6521
alternative hypothesis: true autocorrelation is greater than 0
The DW statistic of 2.18 is close to 2, and the high p-value (0.65) means we cannot reject the null hypothesis. No evidence of autocorrelation here.
By default, dwtest() tests for positive autocorrelation (the more common case). You can test for negative autocorrelation or run a two-sided test:
# Test for negative autocorrelation
dwtest(model, alternative = "less")
# Two-sided test (either positive or negative)
dwtest(model, alternative = "two.sided")
Alternative Method: The car Package
The car package’s durbinWatsonTest() function offers additional flexibility, particularly its bootstrap option for computing p-values. This matters because the standard DW test assumes normally distributed errors—an assumption that doesn’t always hold.
# Basic Durbin-Watson test using car package
dw_car <- durbinWatsonTest(model)
print(dw_car)
lag Autocorrelation D-W Statistic p-value
1 0.5284671 0.8954334 0
Alternative hypothesis: rho != 0
Notice that car reports the estimated autocorrelation coefficient (0.528) alongside the DW statistic. This gives you a direct measure of the correlation strength. The function also defaults to a two-sided test.
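The two numbers are linked: in large samples, DW ≈ 2(1 − ρ̂), where ρ̂ is the lag-1 autocorrelation of the residuals. A quick sanity check (note that cor() on lagged residuals gives a close but not identical estimate to the one car reports):

```r
# Verify the approximate relationship DW ≈ 2 * (1 - rho_hat)
e <- residuals(model)
rho_hat <- cor(e[-1], e[-length(e)])  # lag-1 residual correlation
2 * (1 - rho_hat)  # should land near the reported DW statistic
```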
The real advantage of car is its bootstrap option:
# Bootstrap p-value (more robust, especially with non-normal errors)
set.seed(42)
dw_bootstrap <- durbinWatsonTest(model, simulate = TRUE, reps = 1000)
print(dw_bootstrap)
lag Autocorrelation D-W Statistic p-value
1 0.5284671 0.8954334 0.002
Alternative hypothesis: rho != 0
The bootstrap resamples residuals to build an empirical distribution of the test statistic, from which the p-value is computed. With 1000 replications, you get a more reliable p-value that doesn't depend on distributional assumptions. For serious analysis, I recommend at least 1000 bootstrap replications, and 5000 for publication.
Interpreting Results and Next Steps
Understanding your DW statistic is only half the battle. Here’s a practical decision framework:
| DW Value | Interpretation | Action Required |
|---|---|---|
| 1.5 - 2.5 | Likely no significant autocorrelation | Proceed with standard inference |
| < 1.5 | Positive autocorrelation suspected | Run formal test, consider corrections |
| > 2.5 | Negative autocorrelation suspected | Run formal test, consider corrections |
| < 1.0 or > 3.0 | Strong autocorrelation | Corrections definitely needed |
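The table above can be wrapped in a small rule-of-thumb helper (classify_dw is a hypothetical name, and the thresholds are the informal cutoffs from the table, not a substitute for the formal test):

```r
# Rule-of-thumb classifier for a DW value, following the table above
classify_dw <- function(dw) {
  if (dw < 1.0 || dw > 3.0) "strong autocorrelation: corrections needed"
  else if (dw < 1.5)        "positive autocorrelation suspected"
  else if (dw > 2.5)        "negative autocorrelation suspected"
  else                      "likely no significant autocorrelation"
}
classify_dw(0.895)  # "strong autocorrelation: corrections needed"
```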
When you detect autocorrelation, visualize it with the autocorrelation function:
# Plot residual autocorrelation
par(mfrow = c(1, 2))
# ACF plot for autocorrelated model
acf(residuals(model), main = "ACF: Autocorrelated Model")
# ACF plot for clean model
acf(residuals(model_clean), main = "ACF: Clean Model")
par(mfrow = c(1, 1))
The ACF plot shows correlation at each lag. Significant spikes beyond the confidence bands (dashed lines) indicate autocorrelation at that lag. For the autocorrelated model, you’ll see a gradually decaying pattern typical of AR(1) processes.
When autocorrelation is present, you have several remediation options:
Option 1: Newey-West Standard Errors
# Install and load sandwich package for robust SEs
library(sandwich)
library(lmtest)
# Compute Newey-West robust standard errors
coeftest(model, vcov = NeweyWest(model))
This adjusts your standard errors to account for autocorrelation without changing coefficient estimates. Use this when you primarily care about inference.
Option 2: Generalized Least Squares
library(nlme)
# Fit GLS model with AR(1) correlation structure
gls_model <- gls(sales ~ advertising + price,
                 data = sales_data,
                 correlation = corAR1(form = ~ quarter))
summary(gls_model)
GLS explicitly models the correlation structure and produces more efficient estimates.
Option 3: Add Lagged Variables
# Add lagged dependent variable
sales_data$lag_sales <- c(NA, sales_data$sales[-n])
model_lag <- lm(sales ~ advertising + price + lag_sales,
                data = sales_data, na.action = na.exclude)
# Check if autocorrelation is resolved
dwtest(model_lag)
Adding a lagged dependent variable often absorbs the autocorrelation, though it changes the interpretation of your model.
Conclusion
The Durbin-Watson test is a fundamental diagnostic for any regression involving ordered data. In R, you have two solid options: use dwtest() from lmtest for quick, straightforward testing, or use durbinWatsonTest() from car when you need bootstrap p-values or want the autocorrelation coefficient reported directly.
Always test for autocorrelation before trusting your regression inference on time-series or panel data. When you find it, don’t ignore it—apply Newey-West standard errors, fit a GLS model, or restructure your specification with lagged variables. The DW test takes seconds to run; the consequences of ignoring autocorrelation can invalidate your entire analysis.