How to Perform Levene's Test in R
Key Insights
- Levene’s test checks whether groups have equal variances (homoscedasticity), a critical assumption for ANOVA and t-tests—use it before running these analyses
- The car package's leveneTest() function is the standard approach in R, with the center = "median" option providing robustness against non-normal data
- A significant result (p < 0.05) means variances differ between groups, signaling you should use Welch's ANOVA or data transformations instead of standard parametric tests
Introduction to Levene’s Test
Levene’s test answers a simple question: do my groups have similar variances? This matters because many statistical tests—ANOVA, t-tests, linear regression—assume homogeneity of variances (homoscedasticity). Violate this assumption, and your p-values become unreliable.
Unlike Bartlett’s test, which is highly sensitive to departures from normality, Levene’s test remains robust when your data isn’t perfectly normal. This makes it the practical choice for real-world data, which rarely follows textbook distributions.
The test works by computing the absolute deviations from each group’s center (mean or median), then running an ANOVA on these deviations. If the deviations differ significantly across groups, the variances are unequal.
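That two-step mechanism is easy to verify by hand. The sketch below (on made-up data, using base R only) computes absolute deviations from each group's median and runs a one-way ANOVA on them:

```r
# A minimal by-hand sketch of median-centered Levene's test on made-up data
set.seed(1)
values <- c(rnorm(20, sd = 1), rnorm(20, sd = 3))
groups <- factor(rep(c("A", "B"), each = 20))

# Step 1: absolute deviations from each group's median
abs_dev <- abs(values - ave(values, groups, FUN = median))

# Step 2: one-way ANOVA on those deviations
anova(aov(abs_dev ~ groups))
```

Because car's leveneTest() performs exactly this computation, the F statistic and p-value from this sketch match what the package reports for the same data.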
Use Levene’s test when:
- You’re about to run an ANOVA or t-test
- You suspect groups might have different spreads
- Your data may not be normally distributed
Prerequisites and Setup
The car package (Companion to Applied Regression) provides the most widely used implementation of Levene's test in R. Install and load it:
```r
# Install the car package (run once)
install.packages("car")

# Load the package
library(car)
```
If you’re working in an environment where you can’t install packages, base R offers var.test() for two-group comparisons, but it assumes normality and only handles two groups. For practical work, stick with car.
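For completeness, here is what that base-R fallback looks like on simulated data (a sketch; var.test() performs an F test on the ratio of two variances and, unlike Levene's test, is sensitive to non-normality):

```r
# Base R alternative for exactly two groups (assumes normality)
set.seed(7)
x <- rnorm(30, mean = 0, sd = 2)
y <- rnorm(30, mean = 0, sd = 2)
var.test(x, y)  # H0: the ratio of the two variances equals 1
```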
Basic Levene’s Test with Two Groups
Let’s start with the simplest case: comparing variances between two groups. Suppose you’re comparing reaction times between a control group and a treatment group.
```r
library(car)

# Create sample data
set.seed(42)
control   <- rnorm(30, mean = 250, sd = 20)
treatment <- rnorm(30, mean = 240, sd = 35)  # Higher variance

# Combine into a data frame
reaction_data <- data.frame(
  time  = c(control, treatment),
  group = factor(rep(c("Control", "Treatment"), each = 30))
)

# Run Levene's test
leveneTest(time ~ group, data = reaction_data)
```
Output:

```
Levene's Test for Homogeneity of Variance (center = median)
      Df F value   Pr(>F)
group  1  9.6231 0.002987 **
      58
```
Interpreting this output:
- Df: Degrees of freedom (1 for the group effect, 58 for residuals)
- F value: The test statistic (9.62)—larger values indicate greater variance differences
- Pr(>F): The p-value (0.003)
With p = 0.003, we reject the null hypothesis of equal variances. The treatment group's variance differs significantly from the control group's, which makes sense given we simulated it with sd = 35 versus sd = 20.
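If a script needs to branch on this result, note that leveneTest() returns a data-frame-like object, so the p-value can be pulled out directly (continuing with the reaction_data frame built above):

```r
# Extract the p-value from the test object rather than reading it by eye
result  <- leveneTest(time ~ group, data = reaction_data)
p_value <- result$`Pr(>F)`[1]
p_value  # about 0.003 for this data
```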
Levene’s Test with Multiple Groups
When preparing for ANOVA, you’ll often have three or more groups. Levene’s test handles this seamlessly.
```r
# Simulate test scores from three teaching methods
set.seed(123)
method_a <- rnorm(25, mean = 75, sd = 8)
method_b <- rnorm(25, mean = 78, sd = 8)
method_c <- rnorm(25, mean = 72, sd = 15)  # Higher variance

scores_data <- data.frame(
  score  = c(method_a, method_b, method_c),
  method = factor(rep(c("A", "B", "C"), each = 25))
)

# Run Levene's test
leveneTest(score ~ method, data = scores_data)
```
Output:

```
Levene's Test for Homogeneity of Variance (center = median)
      Df F value   Pr(>F)
group  2  5.8934 0.004297 **
      72
```
The degrees of freedom now show 2 (for three groups minus one) and 72 (total observations minus number of groups). The significant p-value (0.004) tells us at least one group has different variance—Method C with its larger standard deviation is the culprit.
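Levene's test reports that at least one variance differs, but not which group is responsible. A quick follow-up is to compare the per-group standard deviations directly:

```r
# Per-group standard deviations pinpoint which group drives the result
tapply(scores_data$score, scores_data$method, sd)
```

With the simulated scores_data above, Method C's standard deviation comes out noticeably larger than A's and B's, matching the sd = 15 used to generate it.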
Choosing the Center Parameter
Levene’s test calculates deviations from a center point. The center argument controls which center:
- "median" (default): most robust to outliers and non-normal data
- "mean": the original Levene's test; use when data is approximately normal
- Trimmed mean: a compromise between robustness and efficiency; car has no "trimmed" string option, so request it as center = mean, trim = 0.1 (the trim argument is passed through to mean)
```r
# Compare center options on skewed data
set.seed(456)
group1 <- rexp(30, rate = 0.1)   # Exponential (right-skewed)
group2 <- rexp(30, rate = 0.05)  # Different rate

skewed_data <- data.frame(
  value = c(group1, group2),
  group = factor(rep(c("G1", "G2"), each = 30))
)

# Using median (Brown-Forsythe test - more robust)
cat("Center = median:\n")
leveneTest(value ~ group, data = skewed_data, center = "median")

# Using mean (classical Levene's test)
cat("\nCenter = mean:\n")
leveneTest(value ~ group, data = skewed_data, center = "mean")
```
Output:

```
Center = median:
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  1.7842 0.1871
      58

Center = mean:
Levene's Test for Homogeneity of Variance (center = mean)
      Df F value Pr(>F)
group  1  2.9156 0.0931
      58
```
Notice the different F values and p-values. With skewed data, the mean-based version can be influenced by extreme values, potentially giving misleading results. The median-based version (technically called the Brown-Forsythe test) provides more reliable inference for non-normal data.
My recommendation: Use center = "median" as your default. Switch to center = "mean" only when you’ve verified your data is approximately normal.
Interpreting Results and Next Steps
Levene’s test gives you a binary decision point:
Non-significant result (p ≥ 0.05): Variances are approximately equal. Proceed with standard ANOVA or t-test.
Significant result (p < 0.05): Variances differ significantly. You have options:
- Use Welch's ANOVA, which doesn't assume equal variances:

```r
oneway.test(score ~ method, data = scores_data, var.equal = FALSE)
```

- Use Welch's t-test (for two groups). R's default t.test() already uses this:

```r
t.test(time ~ group, data = reaction_data)  # Welch's by default
```

- Transform your data. Log or square root transformations can stabilize variances:

```r
scores_data$log_score <- log(scores_data$score)
leveneTest(log_score ~ method, data = scores_data)
```

- Use a non-parametric alternative. The Kruskal-Wallis test doesn't assume equal variances:

```r
kruskal.test(score ~ method, data = scores_data)
```
A word of caution: don’t obsess over the 0.05 threshold. If p = 0.06, your variances aren’t magically equal. Consider the practical magnitude of variance differences and your sample sizes. With large samples, even trivial variance differences become “significant.”
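One quick way to gauge practical magnitude is the ratio of the largest to the smallest group variance. A common informal rule of thumb (not part of Levene's test itself) holds that ratios below roughly 4 are rarely a problem for ANOVA when group sizes are similar:

```r
# Variance ratio as a rough magnitude check alongside the p-value
group_vars <- tapply(scores_data$score, scores_data$method, var)
max(group_vars) / min(group_vars)
```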
Complete Worked Example
Let’s walk through a realistic scenario: comparing student test scores across three schools to determine if teaching effectiveness differs.
```r
library(car)
library(ggplot2)

# Simulate realistic test score data
set.seed(2024)
school_data <- data.frame(
  score = c(
    rnorm(45, mean = 72, sd = 12),  # School A: moderate variance
    rnorm(50, mean = 75, sd = 10),  # School B: lower variance
    rnorm(40, mean = 70, sd = 18)   # School C: higher variance
  ),
  school = factor(rep(c("Lincoln High", "Washington Prep", "Jefferson Academy"),
                      times = c(45, 50, 40)))
)

# Step 1: Visualize the data
ggplot(school_data, aes(x = school, y = score, fill = school)) +
  geom_boxplot(alpha = 0.7) +
  geom_jitter(width = 0.2, alpha = 0.3) +
  labs(
    title = "Test Score Distribution by School",
    x = "School",
    y = "Test Score"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

# Step 2: Calculate descriptive statistics
aggregate(score ~ school, data = school_data,
          FUN = function(x) c(mean = mean(x), sd = sd(x), var = var(x)))

# Step 3: Run Levene's test
cat("Levene's Test Results:\n")
levene_result <- leveneTest(score ~ school, data = school_data, center = "median")
print(levene_result)

# Step 4: Make a decision based on results
if (levene_result$`Pr(>F)`[1] < 0.05) {
  cat("\nVariances are significantly different (p < 0.05)")
  cat("\nUsing Welch's ANOVA instead of standard ANOVA:\n\n")
  print(oneway.test(score ~ school, data = school_data, var.equal = FALSE))
} else {
  cat("\nVariances are approximately equal (p >= 0.05)")
  cat("\nProceeding with standard ANOVA:\n\n")
  print(summary(aov(score ~ school, data = school_data)))
}
```
Output:

```
Levene's Test Results:
Levene's Test for Homogeneity of Variance (center = median)
       Df F value   Pr(>F)
group   2  6.2847 0.002541 **
      132

Variances are significantly different (p < 0.05)
Using Welch's ANOVA instead of standard ANOVA:

	One-way analysis of means (not assuming equal variances)

data:  score and school
F = 1.8234, num df = 2.000, denom df = 82.416, p-value = 0.1679
```
The boxplot immediately reveals Jefferson Academy’s wider spread. Levene’s test confirms this statistically (p = 0.003). Because variances differ, we use Welch’s ANOVA, which accounts for heterogeneity. The final result (p = 0.168) suggests no significant difference in mean scores across schools—but we can trust this conclusion because we used the appropriate test.
This workflow—visualize, test assumptions, adapt your analysis—should become second nature. Levene’s test isn’t just a box to check; it’s a decision point that determines which downstream analysis gives you valid results.