How to Perform the Augmented Dickey-Fuller Test in R

Key Insights

The Augmented Dickey-Fuller test checks whether a time series has a unit root and is therefore non-stationary. The null hypothesis is non-stationarity, so a low p-value is evidence that you can proceed with models that assume stationarity.
The tseries package offers a quick one-liner for basic testing, while urca provides granular control over lag selection and deterministic components essential for rigorous analysis.
Always visualize your data first, choose lag lengths using information criteria, and difference the series only as many times as needed to reach stationarity.

Introduction to Stationarity and the ADF Test

Stationarity is the foundation of most time series modeling. A stationary series has constant statistical properties over time—its mean, variance, and autocorrelation structure don’t depend on when you observe it. This matters because models like ARIMA, VAR, and many forecasting methods assume stationarity. Fit a model to non-stationary data, and you’ll get spurious results and unreliable predictions.

The Augmented Dickey-Fuller (ADF) test detects unit roots in time series data. A unit root indicates that shocks to the series have permanent effects—the series “remembers” past values indefinitely and wanders without reverting to a mean. This is the hallmark of non-stationarity.

The hypotheses are straightforward:

Null hypothesis (H₀): The series has a unit root (non-stationary)
Alternative hypothesis (H₁): The series is stationary

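This framing means you need strong evidence (a low p-value) to conclude stationarity. A high p-value is not proof of a unit root: the ADF test has low power when a series is stationary but highly persistent, so failing to reject can simply reflect too little data.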
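To make the null hypothesis concrete, the test statistic can be reproduced by hand. The sketch below is a minimal illustration in base R, not the code any package runs: it regresses the differenced series on the lagged level plus k lagged differences and reads off the t-statistic on the lagged level. The choice k = 2 and all variable names are assumptions made for the example.

```r
set.seed(42)
y <- cumsum(rnorm(200))      # random walk: a unit root by construction

k  <- 2                      # number of lagged differences (arbitrary here)
dy <- diff(y)
X  <- embed(dy, k + 1)       # column 1 = dy_t, columns 2..k+1 = dy_{t-1}, ..., dy_{t-k}

dy_t    <- X[, 1]
dy_lags <- X[, -1, drop = FALSE]
y_lag1  <- y[(k + 1):(length(y) - 1)]   # y_{t-1}, aligned row by row with dy_t

# Regression with a constant: dy_t = a + g * y_{t-1} + lagged differences + error
fit <- lm(dy_t ~ y_lag1 + dy_lags)

# The ADF statistic is the t-value on the lagged level
adf_stat <- coef(summary(fit))["y_lag1", "t value"]
adf_stat
```

The statistic is compared with Dickey-Fuller critical values (roughly -2.9 at the 5% level for a regression with a constant and a sample of this size), not with the usual t-distribution, which is why dedicated functions are used in practice.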

Let’s visualize the difference between stationary and non-stationary data:

set.seed(42)

# Stationary series: white noise with constant mean
stationary <- rnorm(200, mean = 10, sd = 2)

# Non-stationary series: random walk
non_stationary <- cumsum(rnorm(200))

par(mfrow = c(1, 2))
plot(stationary, type = "l", main = "Stationary Series", 
     ylab = "Value", xlab = "Time", col = "steelblue")
abline(h = mean(stationary), col = "red", lty = 2)

plot(non_stationary, type = "l", main = "Non-Stationary Series", 
     ylab = "Value", xlab = "Time", col = "darkgreen")

The stationary series oscillates around a constant mean. The non-stationary random walk drifts without any tendency to revert.
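A quick numerical companion to the plots is the lag-1 sample autocorrelation. The snippet below is a rough sketch that regenerates the two series with the same seed. A value near zero is typical of white noise, while a value close to one is typical of a random walk. It is an informal check, not a substitute for the test.

```r
set.seed(42)
stationary     <- rnorm(200, mean = 10, sd = 2)
non_stationary <- cumsum(rnorm(200))

# acf()$acf starts at lag 0, so the second element is the lag-1 autocorrelation
acf1_stationary     <- acf(stationary, plot = FALSE)$acf[2]
acf1_non_stationary <- acf(non_stationary, plot = FALSE)$acf[2]

round(c(stationary = acf1_stationary, random_walk = acf1_non_stationary), 3)
```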

Prerequisites and R Package Setup

Three packages handle ADF testing in R, each with different strengths:

# Install packages (run once)
install.packages("tseries")
install.packages("urca")
install.packages("aTSA")

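# Load packages
library(tseries)
library(urca)
# aTSA also exports adf.test(), kpss.test(), and pp.test(). Attaching it after
# tseries would mask those functions, so call it as aTSA::adf.test() instead.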

tseries provides a simple, quick implementation ideal for exploratory analysis. urca offers the most control and is preferred for publication-quality work. aTSA reports the test for all three model specifications across a range of lags in a single call.

For this article, we’ll focus primarily on tseries and urca since they cover most practical use cases.

Performing the ADF Test with tseries::adf.test()

The adf.test() function from tseries is the fastest way to test for stationarity:

library(tseries)

# Test the built-in AirPassengers dataset
data("AirPassengers")
adf.test(AirPassengers)

Output:

	Augmented Dickey-Fuller Test

data:  AirPassengers
Dickey-Fuller = -7.3186, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

Wait—this says p-value = 0.01, suggesting stationarity. But AirPassengers clearly has a trend and seasonality. What’s happening?

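The catch is twofold. First, adf.test() interpolates p-values from a table and caps them at 0.01 and 0.99, so the actual p-value here is below 0.01 (R prints a warning to that effect). Second, the function always includes a constant and a linear time trend in the test regression, so its alternative hypothesis is stationarity around a trend. Rejecting the null therefore says that AirPassengers looks trend-stationary, not that its level is stable, and the test says nothing about the seasonality or growing variance that a plot makes obvious.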

Let’s test our simulated series for a clearer picture:

# Test the random walk (should be non-stationary)
set.seed(42)
random_walk <- cumsum(rnorm(200))
adf.test(random_walk)

	Augmented Dickey-Fuller Test

data:  random_walk
Dickey-Fuller = -2.4892, Lag order = 5, p-value = 0.3711
alternative hypothesis: stationary

A p-value of 0.37 means we fail to reject the null hypothesis. The test finds no evidence against a unit root, which is what we expect for a random walk.

# Test white noise (should be stationary)
white_noise <- rnorm(200)
adf.test(white_noise)

	Augmented Dickey-Fuller Test

data:  white_noise
Dickey-Fuller = -5.1432, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

The white noise series correctly shows stationarity with p-value < 0.01.
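Each output above reports "Lag order = 5". When the k argument is omitted, adf.test() uses trunc((length(x) - 1)^(1/3)) lagged differences. The sketch below reproduces that arithmetic and shows how to override it. The helper name default_adf_lag is hypothetical, and the adf.test() call only runs if tseries is installed.

```r
# Default lag order used by tseries::adf.test() when k is not supplied
default_adf_lag <- function(n) trunc((n - 1)^(1/3))

default_adf_lag(144)  # AirPassengers has 144 observations
default_adf_lag(200)  # the simulated series have 200 observations

# Passing k explicitly overrides the default
if (requireNamespace("tseries", quietly = TRUE)) {
  set.seed(42)
  x <- cumsum(rnorm(200))
  print(tseries::adf.test(x, k = 2))
}
```

Both calls to default_adf_lag() return 5, which matches the lag order printed in the outputs.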

Advanced Testing with the urca Package

The urca package’s ur.df() function provides essential control over test specification. The key parameter is type:

"none": No constant, no trend (rare in practice)
"drift": Includes a constant term
"trend": Includes both constant and linear trend

Choosing the wrong type leads to incorrect conclusions:

library(urca)

# Create a series with a deterministic trend
set.seed(42)
trend_series <- 0.5 * (1:200) + rnorm(200, sd = 5)

# Test with different specifications
test_none <- ur.df(trend_series, type = "none", lags = 4)
test_drift <- ur.df(trend_series, type = "drift", lags = 4)
test_trend <- ur.df(trend_series, type = "trend", lags = 4)

summary(test_trend)

The summary() output includes critical values at 1%, 5%, and 10% significance levels:

############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression trend 

Value of test-statistic is: -5.2847 9.3421 13.9876 

Critical values for test statistics: 
      1pct  5pct 10pct
tau3 -3.99 -3.43 -3.13
phi2  6.22  4.75  4.07
phi3  8.43  6.49  5.47

Compare the first test statistic (tau3 = -5.2847) against the tau3 row of critical values. Since -5.28 < -3.99 (the 1% critical value), we reject the unit-root null at the 1% level and conclude that the series is trend-stationary.
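The three statistics in that output correspond to different restrictions on the same test regression. Written in notation chosen for this illustration (the symbols α, β, γ, and δ are assumptions, not names used by urca), the type = "trend" regression with k lagged differences is:

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t
```

tau3 is the t-type statistic for γ = 0 (a unit root). phi3 is an F-type statistic for the joint null γ = β = 0, and phi2 for the joint null α = β = γ = 0. Unlike tau3, the phi statistics reject when they exceed their critical values. Setting β = 0 gives the "drift" regression, and setting α = β = 0 gives "none".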

Here’s how to interpret urca results programmatically:

# Extract test statistic and critical values
test_stat <- test_trend@teststat[1]  # tau3 statistic
critical_values <- test_trend@cval[1, ]  # Critical values for tau3

cat("Test statistic:", test_stat, "\n")
cat("Critical values (1%, 5%, 10%):", critical_values, "\n")
cat("Reject H0 at 5%?", test_stat < critical_values[2], "\n")

Choosing Lag Length and Test Parameters

Lag selection directly affects test power and validity. Too few lags leave autocorrelation in residuals, biasing results. Too many lags reduce power and waste degrees of freedom.

The urca package supports automatic lag selection using information criteria:

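# Automatic lag selection: lags sets the maximum, selectlags picks within it
test_aic <- ur.df(random_walk, type = "drift", lags = 10, selectlags = "AIC")
test_bic <- ur.df(random_walk, type = "drift", lags = 10, selectlags = "BIC")

# The @lags slot stores the maximum passed in (10), not the selected order,
# so count the lagged-difference terms in the fitted test regression instead
selected_lags <- function(test) {
  sum(grepl("^z\\.diff\\.lag", rownames(coef(test@testreg))))
}
cat("Lags selected by AIC:", selected_lags(test_aic), "\n")
cat("Lags selected by BIC:", selected_lags(test_bic), "\n")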

BIC typically selects fewer lags than AIC. For most applications, BIC’s parsimony works well. Use AIC when you suspect complex autocorrelation structure.

Compare how lag choice affects results:

# Test same series with different lag specifications
for (k in c(1, 4, 8, 12)) {
  test <- ur.df(random_walk, type = "drift", lags = k)
  cat("Lags:", k, "| Test stat:", round(test@teststat[1], 3), "\n")
}

Lags: 1 | Test stat: -2.512
Lags: 4 | Test stat: -2.489
Lags: 8 | Test stat: -2.401
Lags: 12 | Test stat: -2.356

Results remain consistent here, but this isn’t always the case. When results vary substantially across lag specifications, investigate your data’s autocorrelation structure more carefully.
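One way to investigate is to check whether the residuals of the test regression still show autocorrelation at a given lag length. The sketch below uses only base R: it fits a drift-type ADF regression by hand and applies a Ljung-Box test to the residuals. The helper name adf_residual_pvalue, the simulated series, and the choice of 10 Ljung-Box lags are all assumptions for illustration.

```r
# Hypothetical helper: fit the drift-type ADF regression with k lagged
# differences and return the Ljung-Box p-value of its residuals
adf_residual_pvalue <- function(y, k, lb_lag = 10) {
  dy     <- diff(y)
  X      <- embed(dy, k + 1)              # column 1 = dy_t, the rest = lagged dy
  dy_t   <- X[, 1]
  y_lag1 <- y[(k + 1):(length(y) - 1)]    # y_{t-1}, aligned with dy_t
  fit <- if (k == 0) {
    lm(dy_t ~ y_lag1)
  } else {
    dy_lags <- X[, -1, drop = FALSE]
    lm(dy_t ~ y_lag1 + dy_lags)
  }
  Box.test(residuals(fit), lag = lb_lag, type = "Ljung-Box")$p.value
}

# Simulated series with a unit root whose differences follow an AR(1)
set.seed(1)
y <- cumsum(as.numeric(arima.sim(model = list(ar = 0.6), n = 300)))

p_values <- sapply(0:4, function(k) adf_residual_pvalue(y, k))
names(p_values) <- paste0("k=", 0:4)
round(p_values, 3)
```

A small p-value signals leftover autocorrelation, which suggests the regression needs more lagged differences. With no lagged differences (k = 0) the residuals inherit the AR(1) structure, so the Ljung-Box test rejects there.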

Interpreting Results and Common Pitfalls

Here’s a complete workflow for testing and achieving stationarity:

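test_stationarity <- function(series, name = "Series", max_lags = 12) {
  # lags sets the maximum order; without it ur.df() defaults to lags = 1
  # and selectlags has nothing to choose from
  test <- ur.df(series, type = "drift", lags = max_lags, selectlags = "BIC")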
  stat <- test@teststat[1]
  crit_5pct <- test@cval[1, 2]
  
  is_stationary <- stat < crit_5pct
  
  cat(name, "\n")
  cat("  Test statistic:", round(stat, 3), "\n")
  cat("  5% critical value:", round(crit_5pct, 3), "\n")
  cat("  Stationary:", is_stationary, "\n\n")
  
  return(is_stationary)
}

# Load the series as a plain numeric vector
data("AirPassengers")
ap <- as.numeric(AirPassengers)

# Original series
test_stationarity(ap, "Original AirPassengers")

# First difference
ap_diff1 <- diff(ap)
test_stationarity(ap_diff1, "First Difference")

# Log transform then difference (common for multiplicative seasonality)
ap_log_diff <- diff(log(ap))
test_stationarity(ap_log_diff, "Log + First Difference")

Common pitfalls to avoid:

Ignoring visual inspection. Always plot your data first. The ADF test can give misleading results with structural breaks or outliers.
Wrong trend specification. Use type = "trend" for data with obvious trends, type = "drift" for data fluctuating around a non-zero mean.
Over-differencing. If your series is already stationary, differencing adds a non-invertible moving-average component and inflates the variance. Test first, and difference only when the unit-root null is not rejected.
Ignoring seasonality. The standard ADF test doesn’t account for seasonal unit roots. For seasonal data, consider seasonal differencing or specialized tests.

# Seasonal differencing example
ap_seasonal_diff <- diff(log(ap), lag = 12)  # Remove yearly seasonality
ap_both_diff <- diff(ap_seasonal_diff)       # Then remove trend

test_stationarity(ap_both_diff, "Seasonal + First Difference")
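Two practical details of combined differencing are worth checking: how many observations it consumes, and whether the order of the two operations matters. The sketch below uses base R and the built-in AirPassengers data. The variable names are chosen for this example.

```r
x <- log(as.numeric(AirPassengers))   # 144 monthly observations

seasonal_then_first <- diff(diff(x, lag = 12))   # lag-12 difference, then first difference
first_then_seasonal <- diff(diff(x), lag = 12)   # the same operations in reverse order

length(x)                    # 144
length(seasonal_then_first)  # 131: 12 lost to the seasonal difference, 1 to the first
all.equal(seasonal_then_first, first_then_seasonal)  # TRUE
```

The two orders give the same series, so the choice is a matter of readability. The 13 lost observations matter more on short series, where each difference also reduces the test's power.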

Conclusion and Next Steps

The ADF test is your first-line tool for assessing stationarity, but use it thoughtfully:

Visualize first. Understand your data before testing.
Choose the right package. Use tseries::adf.test() for quick checks, urca::ur.df() for rigorous analysis.
Specify the model correctly. Match the type argument to your data’s characteristics.
Let data choose lags. Use AIC or BIC for automatic lag selection.
Iterate as needed. Difference and re-test until the unit-root null is rejected, then stop to avoid over-differencing.

The ADF test isn’t perfect. Consider complementary tests: the KPSS test (tseries::kpss.test()) has stationarity as its null hypothesis, providing a useful cross-check. The Phillips-Perron test (tseries::pp.test()) corrects for autocorrelation nonparametrically instead of adding lagged differences, so it offers a second opinion on ADF results.
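Because the ADF and KPSS nulls point in opposite directions, their decisions can be combined into four cases. The helper below is a hypothetical sketch of that logic. The name cross_check, the wording of the labels, and the 5% threshold are assumptions for illustration. The usage example only runs if tseries is installed.

```r
# Hypothetical helper that combines the two decisions at a common alpha.
# ADF null: unit root. KPSS null: stationarity.
cross_check <- function(adf_p, kpss_p, alpha = 0.05) {
  adf_rejects  <- adf_p  < alpha   # evidence against a unit root
  kpss_rejects <- kpss_p < alpha   # evidence against stationarity
  if (adf_rejects && !kpss_rejects) {
    "both tests point to stationarity"
  } else if (!adf_rejects && kpss_rejects) {
    "both tests point to a unit root"
  } else if (adf_rejects && kpss_rejects) {
    "conflicting: both nulls rejected"
  } else {
    "inconclusive: neither null rejected"
  }
}

if (requireNamespace("tseries", quietly = TRUE)) {
  set.seed(42)
  x <- rnorm(200)
  # Both functions warn when the p-value falls outside their tables
  adf_p  <- suppressWarnings(tseries::adf.test(x)$p.value)
  kpss_p <- suppressWarnings(tseries::kpss.test(x, null = "Level")$p.value)
  print(cross_check(adf_p, kpss_p))
}
```

The two disagreement cases are the ones that deserve a closer look, for example by checking for structural breaks or by collecting more data.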

With stationarity confirmed, you’re ready to fit ARIMA models, perform Granger causality tests, or build vector autoregressions. The ADF test is the gateway—master it, and the rest of time series analysis becomes accessible.