How to Perform the Wald Test in R

Key Insights

  • The Wald test is already embedded in your regression output—every p-value in summary(lm()) or summary(glm()) is a Wald test for that coefficient
  • Use the aod package’s wald.test() when you need to test multiple coefficients jointly, such as testing whether a categorical variable with multiple levels is significant overall
  • The Wald test can be unreliable with small samples or when parameters are near boundary values; prefer the likelihood ratio test in these situations

Introduction to the Wald Test

The Wald test answers a fundamental question in regression analysis: is this coefficient significantly different from zero? Named after statistician Abraham Wald, this test compares the estimated parameter to its standard error to determine whether the observed effect is likely due to chance.

You use the Wald test constantly, even if you don’t realize it. Every time you look at the p-values in regression output and decide whether a predictor is “significant,” you’re interpreting Wald tests. Understanding what’s happening under the hood helps you use these tests appropriately and recognize when they might mislead you.

The Wald test shines when you need quick inference on individual coefficients or want to test whether several coefficients are jointly zero. It’s computationally cheap because it only requires fitting the full model once—unlike the likelihood ratio test, which requires fitting both full and reduced models.

Mathematical Foundation

The Wald statistic has an elegant form. For a single coefficient, it’s simply:

$$W = \frac{(\hat{\beta} - \beta_0)^2}{\text{Var}(\hat{\beta})}$$

When testing whether a coefficient equals zero (the most common case), this simplifies to the squared ratio of the estimate to its standard error. Under the null hypothesis, this statistic follows a chi-squared distribution with one degree of freedom.

For linear regression, you’ll typically see the square root of this—the t-statistic—which follows a t-distribution. For generalized linear models like logistic regression, R reports the z-statistic, treating the distribution as approximately normal.

The intuition is straightforward: if your estimate is large relative to its uncertainty, the Wald statistic is large, and the p-value is small. The test penalizes both small effect sizes and high uncertainty.
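
To make this concrete, here is a minimal sketch with made-up numbers (the estimate and standard error are hypothetical) showing that the chi-squared form of the Wald test and the familiar two-sided z-test give identical p-values:

```r
# Hypothetical estimate and standard error
beta_hat <- 1.5
se <- 0.4

# Wald statistic for H0: beta = 0, compared to chi-squared with 1 df
W <- (beta_hat / se)^2
p_chisq <- pchisq(W, df = 1, lower.tail = FALSE)

# Equivalent two-sided z-test on the same ratio
z <- beta_hat / se
p_z <- 2 * pnorm(-abs(z))

all.equal(p_chisq, p_z)  # TRUE
```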

Wald Test in Linear Regression

When you fit a linear model with lm() and call summary(), R automatically performs Wald t-tests for every coefficient. Let’s see this in action:

# Simulate some data
set.seed(42)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)  # This will be a noise variable
y <- 2 + 1.5*x1 - 0.8*x2 + rnorm(n, sd = 1)

# Fit the model
model_lm <- lm(y ~ x1 + x2 + x3)
summary(model_lm)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.05237    0.09987  20.551  < 2e-16 ***
x1           1.53528    0.10424  14.729  < 2e-16 ***
x2          -0.81697    0.10032  -8.145 1.57e-12 ***
x3          -0.05765    0.10583  -0.545    0.587    

Each row contains a Wald test. The t value column shows the test statistic (estimate divided by standard error), and Pr(>|t|) is the two-sided p-value. Here, x1 and x2 are clearly significant, while x3 (our noise variable) is not—exactly as expected.

The test for the intercept asks whether the mean of y is zero when all predictors are zero. This is rarely a meaningful hypothesis, so you can usually ignore it.
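
You can confirm that these t-statistics really are the estimate divided by the standard error by recomputing them from the fitted model:

```r
# Recompute the Wald t-statistics by hand from model_lm above
coefs <- summary(model_lm)$coefficients
t_manual <- coefs[, "Estimate"] / coefs[, "Std. Error"]
all.equal(t_manual, coefs[, "t value"])  # TRUE
```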

Wald Test in Logistic Regression

For logistic regression via glm(), the output looks similar but uses z-statistics instead of t-statistics:

# Create binary outcome
set.seed(42)
prob <- plogis(-1 + 0.8*x1 - 0.5*x2)
y_binary <- rbinom(n, 1, prob)

# Fit logistic regression
model_glm <- glm(y_binary ~ x1 + x2 + x3, family = binomial)
summary(model_glm)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -0.93845    0.24012  -3.908 9.31e-05 ***
x1           0.72134    0.24856   2.902  0.00371 ** 
x2          -0.49887    0.23104  -2.159  0.03082 *  
x3           0.08234    0.21543   0.382  0.70232    

The interpretation is identical: x1 and x2 significantly predict the outcome, while x3 does not. The z-statistics follow an asymptotic normal distribution, which is why you see z value instead of t value.

One important note: these p-values test whether each coefficient equals zero on the log-odds scale. A significant coefficient means the predictor affects the odds of the outcome, but the practical importance depends on the coefficient’s magnitude.
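
To move off the log-odds scale, you can exponentiate the coefficients; you can also reproduce the reported z-statistics and p-values directly from the coefficient vector and its covariance matrix:

```r
# Reproduce the Wald z-statistics and p-values from model_glm above
z_vals <- coef(model_glm) / sqrt(diag(vcov(model_glm)))
p_vals <- 2 * pnorm(-abs(z_vals))  # matches Pr(>|z|) in summary()

# Odds ratios with Wald confidence intervals; confint.default() uses
# estimate +/- z * SE, i.e., the Wald interval
exp(cbind(OR = coef(model_glm), confint.default(model_glm)))
```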

Using the aod Package for Explicit Wald Tests

The summary() output tests each coefficient individually. But what if you want to test whether multiple coefficients are jointly zero? This is common when you have a categorical variable with several levels and want to test its overall significance.

The aod package provides wald.test() for exactly this purpose:

# Install if needed: install.packages("aod")
library(aod)

# Test whether x2 coefficient equals zero (single coefficient)
wald.test(b = coef(model_glm), Sigma = vcov(model_glm), Terms = 3)
Wald test:
----------

Chi-squared test:
X2 = 4.7, df = 1, P(> X2) = 0.031

The Terms argument specifies which coefficients to test (position in the coefficient vector, with intercept at position 1). Here we tested x2 at position 3.
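
If you are unsure which position a coefficient occupies, check the names of the coefficient vector rather than counting by hand:

```r
# Find coefficient positions programmatically
names(coef(model_glm))                 # "(Intercept)" "x1" "x2" "x3"
which(names(coef(model_glm)) == "x2")  # 3
```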

More powerfully, you can test multiple coefficients simultaneously:

# Test whether x2 AND x3 are jointly zero
wald.test(b = coef(model_glm), Sigma = vcov(model_glm), Terms = c(3, 4))
Wald test:
----------

Chi-squared test:
X2 = 5.0, df = 2, P(> X2) = 0.082

This joint test asks: “Can we drop both x2 and x3 from the model without significant loss?” The chi-squared statistic has degrees of freedom equal to the number of coefficients tested.
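
Under the hood, the joint statistic is the quadratic form b' V^-1 b, computed from the selected subvector of coefficients and the matching block of the covariance matrix. A sketch of the same calculation by hand:

```r
# Joint Wald statistic for coefficients 3 and 4, computed manually
idx <- c(3, 4)
b_sub <- coef(model_glm)[idx]
V_sub <- vcov(model_glm)[idx, idx]
W_joint <- drop(t(b_sub) %*% solve(V_sub) %*% b_sub)
pchisq(W_joint, df = length(idx), lower.tail = FALSE)
```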

For categorical predictors with multiple levels, this is essential. Testing each dummy variable separately doesn’t tell you whether the categorical variable as a whole matters:

# Example with categorical predictor
set.seed(42)
category <- factor(sample(c("A", "B", "C", "D"), n, replace = TRUE))
y_cat <- rbinom(n, 1, plogis(-0.5 + 0.3*(category == "B") + 
                                    0.8*(category == "C") - 
                                    0.2*(category == "D")))

model_cat <- glm(y_cat ~ category, family = binomial)
summary(model_cat)

# Test overall significance of category (coefficients 2, 3, 4)
wald.test(b = coef(model_cat), Sigma = vcov(model_cat), Terms = 2:4)

Using the car Package’s linearHypothesis()

The car package offers linearHypothesis(), which provides more flexibility for testing specific hypotheses about coefficients:

# Install if needed: install.packages("car")
library(car)

# Test whether x1 coefficient equals zero
linearHypothesis(model_glm, "x1 = 0")
Linear hypothesis test

Hypothesis:
x1 = 0

Model 1: restricted model
Model 2: y_binary ~ x1 + x2 + x3

  Res.Df Df  Chisq Pr(>Chisq)   
1     97                        
2     96  1 8.4221   0.003709 **

The real power comes from testing custom hypotheses. Want to test whether the effect of x1 is equal in magnitude but opposite in sign to the effect of x2?

# Test whether x1 = -x2 (effects are equal in magnitude, opposite in sign)
linearHypothesis(model_glm, "x1 + x2 = 0")

You can also test multiple hypotheses simultaneously:

# Test x1 = 0 AND x2 = 0 simultaneously
linearHypothesis(model_glm, c("x1 = 0", "x2 = 0"))

This flexibility makes linearHypothesis() ideal for testing theoretically motivated constraints on your model.
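
The same hypotheses can also be written as a matrix, one row per linear restriction on the coefficient vector (Intercept, x1, x2, x3); linearHypothesis() accepts this form as well:

```r
# Equivalent joint test of x1 = 0 and x2 = 0 via a hypothesis matrix
L_mat <- rbind(c(0, 1, 0, 0),   # row encoding x1 = 0
               c(0, 0, 1, 0))   # row encoding x2 = 0
linearHypothesis(model_glm, hypothesis.matrix = L_mat, rhs = c(0, 0))
```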

Limitations and Alternatives

The Wald test has known weaknesses you should understand.

Small sample problems: The Wald test relies on asymptotic theory—it assumes your sample is large enough for the normal approximation to hold. With small samples, especially in logistic regression, the test can be unreliable. Consider the likelihood ratio test instead:

# Likelihood ratio test alternative
model_reduced <- glm(y_binary ~ x1 + x3, family = binomial)
anova(model_reduced, model_glm, test = "LRT")

Boundary problems: When true parameter values are near the boundary of the parameter space (like probabilities near 0 or 1), Wald tests can behave poorly. The confidence intervals may not even contain plausible values.

Hauck-Donner effect: In logistic regression with complete or quasi-complete separation, the Wald statistic can actually decrease as the evidence against the null increases. This perverse behavior means you might fail to reject the null when you should.
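
A toy illustration of what separation does to the Wald test (hypothetical data, chosen so the predictor perfectly separates the outcome): the coefficient estimate diverges, the standard error explodes, and the Wald p-value ends up large even though the evidence could not be stronger.

```r
# Toy data with complete separation: x_sep perfectly predicts y_sep
x_sep <- c(1, 2, 3, 4, 5, 6)
y_sep <- c(0, 0, 0, 1, 1, 1)

# glm() warns that fitted probabilities of 0 or 1 occurred
m_sep <- glm(y_sep ~ x_sep, family = binomial)
summary(m_sep)$coefficients  # huge SE, so the Wald z is tiny and p is large
```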

Practical guidance: For routine significance testing with adequate sample sizes, Wald tests are fine; they are fast and built into standard output. For small samples, consequential decisions, or suspiciously large coefficients and standard errors (a possible sign of separation), verify your conclusions with a likelihood ratio test. The computational cost is minimal, and the peace of mind is worth it.

When in doubt, fit both the full and reduced models and compare them directly. If the Wald test and likelihood ratio test disagree substantially, trust the likelihood ratio test.
