How to Perform a Granger Causality Test in R

Key Insights

Granger causality tests whether past values of one time series help predict another—it measures predictive power, not true causation, so interpret results accordingly.
Your data should be stationary before testing; non-stationary series can produce spurious results that point to predictive relationships that do not exist.
Always test causality in both directions and use information criteria (AIC/BIC) to select the optimal lag order rather than guessing.

Introduction to Granger Causality

Granger causality answers a specific question: does knowing the past values of variable X improve our predictions of variable Y beyond what Y’s own past values provide? If yes, we say X “Granger-causes” Y.

This is not causation in the philosophical sense. If stock prices Granger-cause trading volume, it doesn’t mean price movements physically cause people to trade. It means price data contains predictive information about future volume that volume’s own history doesn’t capture.

Economists use Granger causality constantly—testing whether money supply predicts GDP, whether oil prices predict inflation, or whether consumer sentiment forecasts spending. But the technique applies anywhere you have two time series and want to understand their predictive relationships.

The test works by comparing two models: a restricted model where Y depends only on its own lags, and an unrestricted model where Y depends on both its own lags and X’s lags. If adding X’s lags significantly improves the model, X Granger-causes Y.
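The model comparison described above can be sketched by hand with base R. This is a minimal illustration on simulated data, not the internals of any package: the series x and y, the coefficients, and the single lag are all assumptions chosen so that x leads y by one period.

```r
# Simulated data (assumed for illustration): x leads y by one period
set.seed(42)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (i in 2:n) {
  y[i] <- 0.5 * y[i - 1] + 0.8 * x[i - 1] + rnorm(1)
}

# Align current y with one lag of each series
lagged <- data.frame(
  y      = y[2:n],
  y_lag1 = y[1:(n - 1)],
  x_lag1 = x[1:(n - 1)]
)

# Restricted model: y explained by its own lag only
restricted <- lm(y ~ y_lag1, data = lagged)

# Unrestricted model: y explained by its own lag and the lag of x
unrestricted <- lm(y ~ y_lag1 + x_lag1, data = lagged)

# F-test of the restriction that the coefficient on x_lag1 is zero
comparison <- anova(restricted, unrestricted)
print(comparison)

# p-value of the F-test (second row of the anova table)
p_value <- comparison[["Pr(>F)"]][2]
```

A small p_value here says the lag of x adds explanatory power beyond the lag of y, which is exactly the sense in which x Granger-causes y.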

Prerequisites and Setup

You need three packages for comprehensive Granger causality analysis in R:

# Install required packages
install.packages(c("lmtest", "vars", "tseries"))

# Load them
library(lmtest)  # For grangertest()
library(vars)    # For VAR models and causality()
library(tseries) # For stationarity tests

The lmtest package provides the simple grangertest() function for bivariate analysis. The vars package handles multivariate cases through Vector Autoregression. The tseries package gives you the Augmented Dickey-Fuller test for checking stationarity.

Stationarity is non-negotiable. A (weakly) stationary series has a constant mean and variance over time, and the covariance between two observations depends only on how far apart they are. It doesn’t trend upward, explode, or exhibit changing volatility. Most economic and financial time series are non-stationary in levels but become stationary after differencing.

Check stationarity with the ADF test:

# Augmented Dickey-Fuller test
# Null hypothesis: series has a unit root (non-stationary)
adf.test(your_series)

# If p-value > 0.05, you cannot reject a unit root; treat the series as non-stationary
# You'll need to difference it

A p-value above 0.05 means you cannot reject the null hypothesis of non-stationarity. Your series needs transformation before Granger testing.
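To see why non-stationary inputs are risky, the following sketch regresses pairs of independent random walks on each other, first in levels and then in first differences, and counts how often the slope looks significant. It uses a plain regression rather than a Granger test to keep the illustration simple; the helper slope_p_value() and the simulation sizes are assumptions made for this example.

```r
set.seed(123)

# Hypothetical helper: p-value of the slope in a simple regression of y on x
slope_p_value <- function(y, x) {
  coef(summary(lm(y ~ x)))[2, 4]
}

n_reps <- 200   # number of simulated pairs
n_obs  <- 200   # length of each series
levels_p <- numeric(n_reps)
diffs_p  <- numeric(n_reps)

for (i in seq_len(n_reps)) {
  # Two random walks generated independently of each other
  walk_a <- cumsum(rnorm(n_obs))
  walk_b <- cumsum(rnorm(n_obs))
  levels_p[i] <- slope_p_value(walk_a, walk_b)
  diffs_p[i]  <- slope_p_value(diff(walk_a), diff(walk_b))
}

# Share of simulations where the slope is "significant" at the 5% level
levels_rejection_rate <- mean(levels_p < 0.05)
diffs_rejection_rate  <- mean(diffs_p < 0.05)
print(c(levels = levels_rejection_rate, differences = diffs_rejection_rate))
```

Because the two walks are unrelated by construction, any rejection is a false positive. The rate in levels should come out far above the nominal 5%, while the rate in differences should stay close to it.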

Preparing Your Data

Real-world data requires cleaning before analysis. Here’s how to prepare time series data properly:

# Load your data
data <- read.csv("economic_data.csv")

# Create time series objects
# Assuming monthly data starting January 2010
gdp <- ts(data$gdp, start = c(2010, 1), frequency = 12)
unemployment <- ts(data$unemployment, start = c(2010, 1), frequency = 12)

# Check for missing values
sum(is.na(gdp))
sum(is.na(unemployment))

# Handle missing values (linear interpolation)
# zoo is installed automatically as a dependency of lmtest
library(zoo)
gdp <- na.approx(gdp)
unemployment <- na.approx(unemployment)
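One detail of the interpolation step is worth checking on toy data before trusting it on real series. The sketch below uses small made-up vectors (not the GDP data) to show how na.approx() treats interior gaps versus gaps at the start of a series.

```r
library(zoo)

# Interior gap: filled by linear interpolation between the neighbours
interior_gap <- c(1, NA, 3)
filled <- na.approx(interior_gap)

# Leading gap: there is no earlier value to interpolate from.
# With the default na.rm = TRUE the leading NA is dropped,
# so the result is shorter than the input.
leading_gap <- c(NA, 1, NA, 3)
trimmed <- na.approx(leading_gap)

# With na.rm = FALSE the length is preserved and the leading NA stays
kept <- na.approx(leading_gap, na.rm = FALSE)

print(filled)
print(length(trimmed))
print(length(kept))
```

If two series have missing values at different ends, the default behaviour can leave them with different lengths, so it is worth comparing length() on both before combining them.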

When your ADF test indicates non-stationarity, differencing usually solves the problem:

# First difference
gdp_diff <- diff(gdp)
unemployment_diff <- diff(unemployment)

# Verify stationarity after differencing
adf.test(gdp_diff)
adf.test(unemployment_diff)

# Sometimes you need second differencing
# gdp_diff2 <- diff(gdp, differences = 2)

First differencing transforms levels into changes: instead of GDP values, you’re analyzing period-to-period changes in GDP. (To work with approximate growth rates, difference the logarithm instead.) This loses one observation but typically achieves stationarity. If first differencing isn’t enough, try second differencing, though this is rare for economic data.
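The distinction between changes and growth rates is easy to see on a toy series. The values below are hypothetical and grow by exactly 10% per period.

```r
# Hypothetical series growing 10% per period
gdp_levels <- c(100, 110, 121, 133.1)

# First differences: absolute changes, which keep growing with the level
changes <- diff(gdp_levels)

# Log differences: approximate growth rates, constant for this series
log_growth <- diff(log(gdp_levels))

print(changes)
print(log_growth)
```

The absolute changes increase along with the level, while every log difference equals log(1.1), roughly 0.095. For series that grow exponentially, log differences are often the more natural transformation.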

Performing the Granger Causality Test

The grangertest() function from lmtest provides the simplest approach:

# Combine into a data frame
df <- data.frame(
  gdp = as.numeric(gdp_diff),
  unemployment = as.numeric(unemployment_diff)
)

# Test if unemployment Granger-causes GDP
# Null hypothesis: unemployment does NOT Granger-cause GDP
grangertest(gdp ~ unemployment, order = 4, data = df)

# Test the reverse direction
# Null hypothesis: GDP does NOT Granger-cause unemployment
grangertest(unemployment ~ gdp, order = 4, data = df)

The order parameter specifies how many lags to include. This choice matters significantly. Too few lags miss important dynamics; too many waste degrees of freedom and reduce power.

Use information criteria to select optimal lag order:

# Create a VAR object to use lag selection
var_data <- cbind(df$gdp, df$unemployment)
colnames(var_data) <- c("gdp", "unemployment")

# Select optimal lag using AIC, BIC, and other criteria
VARselect(var_data, lag.max = 12, type = "const")

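The output shows the recommended lag under each criterion. VARselect() labels them AIC(n), HQ(n), SC(n), and FPE(n); SC is the Schwarz criterion, another name for BIC. AIC tends to select more lags; BIC penalizes complexity more heavily. I generally trust BIC for smaller samples and AIC for larger ones.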
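Rather than reading the lag off the printed output, you can pull it out of the object that VARselect() returns. The sketch below uses the Canada dataset that ships with vars as stand-in data; the choice of series and lag.max = 8 are assumptions for illustration. Note that the BIC entry is labeled "SC(n)" in the result.

```r
library(vars)

# Stand-in data: differenced employment (e) and unemployment (U)
data(Canada, package = "vars")
changes <- diff(Canada[, c("e", "U")])

lag_choice <- VARselect(changes, lag.max = 8, type = "const")

# Named vector with one recommended lag per criterion
print(lag_choice$selection)

# Extract individual recommendations as plain numbers
p_aic <- unname(lag_choice$selection["AIC(n)"])
p_bic <- unname(lag_choice$selection["SC(n)"])   # SC = Schwarz criterion = BIC
```

The extracted value can then be passed on as order = p_bic to grangertest() or as p = p_bic to VAR(), which keeps the lag choice and the test in sync.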

Using the VAR Approach for Multiple Variables

When analyzing more than two variables, or when you want more sophisticated output, use the vars package:

# Prepare multivariate data
var_data <- cbind(gdp_diff, unemployment_diff)
colnames(var_data) <- c("gdp", "unemployment")

# Fit VAR model with selected lag order
var_model <- VAR(var_data, p = 4, type = "const")

# Summary of the VAR model
summary(var_model)

# Granger causality test
# Tests whether unemployment Granger-causes gdp
causality(var_model, cause = "unemployment")

# Test the reverse
causality(var_model, cause = "gdp")

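The causality() function returns two tests: the Granger test and an instantaneous causality test. Focus on the Granger test results for predictive causality. Note that causality() tests whether the variables named in cause Granger-cause all remaining variables in the VAR jointly, so in a two-variable model it reduces to the pairwise test.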
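If you want the numbers rather than the printed report, the two tests can be read from the returned list. This sketch again uses the Canada dataset from vars as stand-in data with an assumed lag order of 2.

```r
library(vars)

# Stand-in data: differenced employment (e) and unemployment (U)
data(Canada, package = "vars")
changes <- diff(Canada[, c("e", "U")])

fit <- VAR(changes, p = 2, type = "const")
result <- causality(fit, cause = "U")

# The result is a list holding one test object per test
print(names(result))

# p-values of the Granger test and the instantaneous causality test
granger_p <- as.numeric(result$Granger$p.value)
instant_p <- as.numeric(result$Instant$p.value)
print(c(granger = granger_p, instantaneous = instant_p))
```

Storing the p-values this way makes it straightforward to tabulate results across several models or lag orders.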

For three or more variables, the VAR approach handles joint causality:

# With three variables
# (assumes inflation_diff is a stationary, differenced series prepared like the others)
var_data_3 <- cbind(gdp_diff, unemployment_diff, inflation_diff)
colnames(var_data_3) <- c("gdp", "unemployment", "inflation")

var_model_3 <- VAR(var_data_3, p = 4, type = "const")

# Test if unemployment and inflation jointly Granger-cause gdp
causality(var_model_3, cause = c("unemployment", "inflation"))

Interpreting Results

The output from grangertest() looks like this:

Granger causality test

Model 1: gdp ~ Lags(gdp, 1:4) + Lags(unemployment, 1:4)
Model 2: gdp ~ Lags(gdp, 1:4)
  Res.Df Df      F   Pr(>F)   
1     85                      
2     89 -4 3.2145 0.01642 *

The F-statistic (3.2145) tests whether the additional unemployment lags significantly improve the model. The p-value (0.01642) is below 0.05, so we reject the null hypothesis—unemployment Granger-causes GDP at the 5% significance level.

Key interpretation points:

P-value below 0.05: Reject the null. X Granger-causes Y. The past values of X contain predictive information about Y.

P-value above 0.05: Fail to reject the null. No evidence that X Granger-causes Y. This doesn’t prove X has no effect—just that we can’t detect predictive power with this data and lag structure.

Bidirectional causality: If both directions show significance, you have feedback effects. GDP predicts unemployment and unemployment predicts GDP.
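Testing both directions is repetitive enough to wrap in a small helper. The function granger_both_ways() below is a hypothetical convenience wrapper around grangertest(), demonstrated on simulated data in which x leads y and not the other way round.

```r
library(lmtest)

# Hypothetical helper: run grangertest() in both directions
# and return the two p-values
granger_both_ways <- function(x, y, order) {
  d <- data.frame(x = x, y = y)
  x_to_y <- grangertest(y ~ x, order = order, data = d)  # H0: x does not Granger-cause y
  y_to_x <- grangertest(x ~ y, order = order, data = d)  # H0: y does not Granger-cause x
  c(
    x_causes_y = x_to_y[["Pr(>F)"]][2],
    y_causes_x = y_to_x[["Pr(>F)"]][2]
  )
}

# Simulated data (assumed for illustration): y depends on last period's x
set.seed(1)
n <- 300
x <- rnorm(n)
y <- numeric(n)
for (i in 2:n) {
  y[i] <- 0.7 * x[i - 1] + rnorm(1)
}

p_values <- granger_both_ways(x, y, order = 2)
print(p_values)
```

With this setup the x_causes_y p-value should be very small. The reverse direction has no built-in relationship, so any rejection there would be a false positive at roughly the chosen significance level.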

Common pitfalls to avoid:

Testing non-stationary data produces meaningless results
Wrong lag selection biases your conclusions
Omitted variables can create spurious Granger causality
Statistical significance doesn’t imply economic significance

Practical Example: Complete Workflow

Here’s a complete, reproducible analysis using built-in R data:

# Load packages
library(lmtest)
library(vars)
library(tseries)

# Use built-in Canadian economic data
data(Canada, package = "vars")
head(Canada)

# Variables: e (employment), prod (productivity), rw (real wages), U (unemployment)

# Check stationarity of employment and unemployment
adf.test(Canada[, "e"])      # Employment
adf.test(Canada[, "U"])      # Unemployment

# Both likely non-stationary, so difference them
e_diff <- diff(Canada[, "e"])
U_diff <- diff(Canada[, "U"])

# Verify stationarity after differencing
adf.test(e_diff)
adf.test(U_diff)

# Combine for analysis
analysis_data <- cbind(e_diff, U_diff)
colnames(analysis_data) <- c("employment", "unemployment")

# Select optimal lag order
VARselect(analysis_data, lag.max = 8, type = "const")
# Suppose the criteria suggest 2 lags; use the value from your own output

# Simple Granger test approach
df <- data.frame(
  employment = as.numeric(e_diff),
  unemployment = as.numeric(U_diff)
)

# Does unemployment Granger-cause employment?
cat("\n=== Testing: Unemployment -> Employment ===\n")
grangertest(employment ~ unemployment, order = 2, data = df)

# Does employment Granger-cause unemployment?
cat("\n=== Testing: Employment -> Unemployment ===\n")
grangertest(unemployment ~ employment, order = 2, data = df)

# VAR approach for richer output
var_model <- VAR(analysis_data, p = 2, type = "const")

cat("\n=== VAR-based Causality Tests ===\n")
causality(var_model, cause = "unemployment")
causality(var_model, cause = "employment")

# Diagnostic checks on VAR residuals
serial.test(var_model, lags.pt = 12)  # Serial correlation
normality.test(var_model)              # Normality

This workflow covers the main steps: stationarity testing, differencing, lag selection, bidirectional Granger tests, and model diagnostics. The serial correlation test checks whether your residuals behave like white noise. A small p-value signals leftover autocorrelation, in which case you may need more lags.
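One way to act on the serial correlation check is to compare the Portmanteau p-value across several lag orders. The sketch below is a rough diagnostic loop under assumed settings (lag orders 1 to 4, lags.pt = 12), not a formal selection procedure.

```r
library(vars)

# Same data as in the workflow: differenced employment and unemployment
data(Canada, package = "vars")
changes <- diff(Canada[, c("e", "U")])

max_p <- 4
pt_p_values <- numeric(max_p)

for (p in seq_len(max_p)) {
  fit <- VAR(changes, p = p, type = "const")
  # Portmanteau test; H0: no serial correlation in the residuals
  pt <- serial.test(fit, lags.pt = 12, type = "PT.asymptotic")
  pt_p_values[p] <- as.numeric(pt$serial$p.value)
}

names(pt_p_values) <- paste0("p=", seq_len(max_p))
print(round(pt_p_values, 4))
```

A lag order whose p-value is comfortably above 0.05 leaves no detectable autocorrelation by this test. If the information criteria and this diagnostic disagree, it is reasonable to report Granger results for both lag orders.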

Run this code, examine the p-values, and draw your conclusions. Remember that Granger causality is about prediction, not mechanism. Finding that unemployment Granger-causes employment changes tells you unemployment data helps forecast employment—it doesn’t tell you why or through what channels. For causal mechanisms, you need economic theory and additional identification strategies.