How to Perform Levene's Test in R
Key Insights
- Levene’s test checks whether groups have equal variances (homoscedasticity), a critical assumption for ANOVA and t-tests—use it before running these analyses
- The car package's leveneTest() function is the standard approach in R, with the center = "median" option providing robustness against non-normal data
- A significant result (p < 0.05) means variances differ between groups, signaling you should use Welch's ANOVA or data transformations instead of standard parametric tests
Introduction to Levene’s Test
Levene’s test answers a simple question: do my groups have similar variances? This matters because many statistical tests—ANOVA, t-tests, linear regression—assume homogeneity of variances (homoscedasticity). Violate this assumption, and your p-values become unreliable.
Unlike Bartlett’s test, which is highly sensitive to departures from normality, Levene’s test remains robust when your data isn’t perfectly normal. This makes it the practical choice for real-world data, which rarely follows textbook distributions.
The test works by computing the absolute deviations from each group’s center (mean or median), then running an ANOVA on these deviations. If the deviations differ significantly across groups, the variances are unequal.
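That two-step mechanism is easy to verify by hand. The sketch below (on made-up data, using base R only) computes absolute deviations from each group's median and runs a one-way ANOVA on them:

```r
# A minimal by-hand sketch of median-centered Levene's test on made-up data
set.seed(1)
values <- c(rnorm(20, sd = 1), rnorm(20, sd = 3))
groups <- factor(rep(c("A", "B"), each = 20))

# Step 1: absolute deviations from each group's median
abs_dev <- abs(values - ave(values, groups, FUN = median))

# Step 2: one-way ANOVA on those deviations
anova(aov(abs_dev ~ groups))
```

Because car's leveneTest() performs exactly this computation, the F statistic and p-value from this sketch match what the package reports for the same data.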
Use Levene’s test when:
- You’re about to run an ANOVA or t-test
- You suspect groups might have different spreads
- Your data may not be normally distributed
Prerequisites and Setup
The car package (Companion to Applied Regression) provides the most widely used implementation of Levene's test in R. Install and load it:
```r
# Install the car package (run once)
install.packages("car")

# Load the package
library(car)
```
If you’re working in an environment where you can’t install packages, base R offers var.test() for two-group comparisons, but it assumes normality and only handles two groups. For practical work, stick with car.
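For completeness, here is what that base-R fallback looks like on simulated data (a sketch; var.test() performs an F test on the ratio of two variances and, unlike Levene's test, is sensitive to non-normality):

```r
# Base R alternative for exactly two groups (assumes normality)
set.seed(7)
x <- rnorm(30, mean = 0, sd = 2)
y <- rnorm(30, mean = 0, sd = 2)
var.test(x, y)  # H0: the ratio of the two variances equals 1
```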
Basic Levene’s Test with Two Groups
Let’s start with the simplest case: comparing variances between two groups. Suppose you’re comparing reaction times between a control group and a treatment group.
```r
library(car)

# Create sample data
set.seed(42)
control   <- rnorm(30, mean = 250, sd = 20)
treatment <- rnorm(30, mean = 240, sd = 35)  # Higher variance

# Combine into a data frame
reaction_data <- data.frame(
  time  = c(control, treatment),
  group = factor(rep(c("Control", "Treatment"), each = 30))
)

# Run Levene's test
leveneTest(time ~ group, data = reaction_data)
```
Output:

```
Levene's Test for Homogeneity of Variance (center = median)
      Df F value   Pr(>F)
group  1  9.6231 0.002987 **
      58
```
Interpreting this output:
- Df: Degrees of freedom (1 for the group effect, 58 for residuals)
- F value: The test statistic (9.62)—larger values indicate greater variance differences
- Pr(>F): The p-value (0.003)
With p = 0.003, we reject the null hypothesis of equal variances. The treatment group's variance differs significantly from the control group's, which makes sense given we simulated it with sd = 35 versus sd = 20.
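If a script needs to branch on this result, note that leveneTest() returns a data-frame-like object, so the p-value can be pulled out directly (continuing with the reaction_data frame built above):

```r
# Extract the p-value from the test object rather than reading it by eye
result  <- leveneTest(time ~ group, data = reaction_data)
p_value <- result$`Pr(>F)`[1]
p_value  # about 0.003 for this data
```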
Levene’s Test with Multiple Groups
When preparing for ANOVA, you’ll often have three or more groups. Levene’s test handles this seamlessly.
```r
# Simulate test scores from three teaching methods
set.seed(123)
method_a <- rnorm(25, mean = 75, sd = 8)
method_b <- rnorm(25, mean = 78, sd = 8)
method_c <- rnorm(25, mean = 72, sd = 15)  # Higher variance

scores_data <- data.frame(
  score  = c(method_a, method_b, method_c),
  method = factor(rep(c("A", "B", "C"), each = 25))
)

# Run Levene's test
leveneTest(score ~ method, data = scores_data)
```
Output:

```
Levene's Test for Homogeneity of Variance (center = median)
      Df F value   Pr(>F)
group  2  5.8934 0.004297 **
      72
```
The degrees of freedom now show 2 (for three groups minus one) and 72 (total observations minus number of groups). The significant p-value (0.004) tells us at least one group has different variance—Method C with its larger standard deviation is the culprit.
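Levene's test reports that at least one variance differs, but not which group is responsible. A quick follow-up is to compare the per-group standard deviations directly:

```r
# Per-group standard deviations pinpoint which group drives the result
tapply(scores_data$score, scores_data$method, sd)
```

With the simulated scores_data above, Method C's standard deviation comes out noticeably larger than A's and B's, matching the sd = 15 used to generate it.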
Choosing the Center Parameter
Levene’s test calculates deviations from a center point. The center argument controls which center:
- "median" (default): most robust to outliers and non-normal data
- "mean": the original Levene's test; use when data is approximately normal
- Trimmed mean: a compromise between robustness and efficiency; car has no "trimmed" string option, so request it as center = mean, trim = 0.1 (the trim argument is passed through to mean)
```r
# Compare center options on skewed data
set.seed(456)
group1 <- rexp(30, rate = 0.1)   # Exponential (right-skewed)
group2 <- rexp(30, rate = 0.05)  # Different rate

skewed_data <- data.frame(
  value = c(group1, group2),
  group = factor(rep(c("G1", "G2"), each = 30))
)

# Using median (Brown-Forsythe test - more robust)
cat("Center = median:\n")
leveneTest(value ~ group, data = skewed_data, center = "median")

# Using mean (classical Levene's test)
cat("\nCenter = mean:\n")
leveneTest(value ~ group, data = skewed_data, center = "mean")
```
Output:

```
Center = median:
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  1.7842 0.1871
      58

Center = mean:
Levene's Test for Homogeneity of Variance (center = mean)
      Df F value Pr(>F)
group  1  2.9156 0.0931
      58
```
Notice the different F values and p-values. With skewed data, the mean-based version can be influenced by extreme values, potentially giving misleading results. The median-based version (technically called the Brown-Forsythe test) provides more reliable inference for non-normal data.
My recommendation: Use center = "median" as your default. Switch to center = "mean" only when you’ve verified your data is approximately normal.
Interpreting Results and Next Steps
Levene’s test gives you a binary decision point:
Non-significant result (p ≥ 0.05): Variances are approximately equal. Proceed with standard ANOVA or t-test.
Significant result (p < 0.05): Variances differ significantly. You have options:
- Use Welch's ANOVA, which doesn't assume equal variances:

```r
oneway.test(score ~ method, data = scores_data, var.equal = FALSE)
```

- Use Welch's t-test (for two groups). R's default t.test() already uses this:

```r
t.test(time ~ group, data = reaction_data)  # Welch's by default
```

- Transform your data. Log or square root transformations can stabilize variances:

```r
scores_data$log_score <- log(scores_data$score)
leveneTest(log_score ~ method, data = scores_data)
```

- Use a non-parametric alternative. The Kruskal-Wallis test doesn't assume equal variances:

```r
kruskal.test(score ~ method, data = scores_data)
```
A word of caution: don’t obsess over the 0.05 threshold. If p = 0.06, your variances aren’t magically equal. Consider the practical magnitude of variance differences and your sample sizes. With large samples, even trivial variance differences become “significant.”
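One quick way to gauge practical magnitude is the ratio of the largest to the smallest group variance. A common informal rule of thumb (not part of Levene's test itself) holds that ratios below roughly 4 are rarely a problem for ANOVA when group sizes are similar:

```r
# Variance ratio as a rough magnitude check alongside the p-value
group_vars <- tapply(scores_data$score, scores_data$method, var)
max(group_vars) / min(group_vars)
```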
Complete Worked Example
Let’s walk through a realistic scenario: comparing student test scores across three schools to determine if teaching effectiveness differs.
```r
library(car)
library(ggplot2)

# Simulate realistic test score data
set.seed(2024)
school_data <- data.frame(
  score = c(
    rnorm(45, mean = 72, sd = 12),  # School A: moderate variance
    rnorm(50, mean = 75, sd = 10),  # School B: lower variance
    rnorm(40, mean = 70, sd = 18)   # School C: higher variance
  ),
  school = factor(rep(c("Lincoln High", "Washington Prep", "Jefferson Academy"),
                      times = c(45, 50, 40)))
)

# Step 1: Visualize the data
ggplot(school_data, aes(x = school, y = score, fill = school)) +
  geom_boxplot(alpha = 0.7) +
  geom_jitter(width = 0.2, alpha = 0.3) +
  labs(
    title = "Test Score Distribution by School",
    x = "School",
    y = "Test Score"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

# Step 2: Calculate descriptive statistics
aggregate(score ~ school, data = school_data,
          FUN = function(x) c(mean = mean(x), sd = sd(x), var = var(x)))

# Step 3: Run Levene's test
cat("Levene's Test Results:\n")
levene_result <- leveneTest(score ~ school, data = school_data, center = "median")
print(levene_result)

# Step 4: Make a decision based on results
if (levene_result$`Pr(>F)`[1] < 0.05) {
  cat("\nVariances are significantly different (p < 0.05)")
  cat("\nUsing Welch's ANOVA instead of standard ANOVA:\n\n")
  print(oneway.test(score ~ school, data = school_data, var.equal = FALSE))
} else {
  cat("\nVariances are approximately equal (p >= 0.05)")
  cat("\nProceeding with standard ANOVA:\n\n")
  print(summary(aov(score ~ school, data = school_data)))
}
```
Output:

```
Levene's Test Results:
Levene's Test for Homogeneity of Variance (center = median)
       Df F value   Pr(>F)
group   2  6.2847 0.002541 **
      132

Variances are significantly different (p < 0.05)
Using Welch's ANOVA instead of standard ANOVA:

	One-way analysis of means (not assuming equal variances)

data:  score and school
F = 1.8234, num df = 2.000, denom df = 82.416, p-value = 0.1679
```
The boxplot immediately reveals Jefferson Academy’s wider spread. Levene’s test confirms this statistically (p = 0.003). Because variances differ, we use Welch’s ANOVA, which accounts for heterogeneity. The final result (p = 0.168) suggests no significant difference in mean scores across schools—but we can trust this conclusion because we used the appropriate test.
This workflow—visualize, test assumptions, adapt your analysis—should become second nature. Levene’s test isn’t just a box to check; it’s a decision point that determines which downstream analysis gives you valid results.