How to Perform a MANOVA in R
Key Insights
- MANOVA extends ANOVA to handle multiple dependent variables simultaneously, controlling Type I error inflation and detecting multivariate patterns that univariate tests miss
- Always check assumptions before running MANOVA—violations of multivariate normality or homogeneity of covariance matrices can invalidate your results
- Pillai’s Trace is the most robust test statistic for general use, but understanding when to use Wilks’ Lambda or other alternatives matters for accurate reporting
Introduction to MANOVA
Multivariate Analysis of Variance (MANOVA) answers a question that regular ANOVA cannot: do groups differ across multiple dependent variables considered together? While you could run separate ANOVAs for each outcome, this approach inflates your Type I error rate and ignores correlations between your dependent variables.
MANOVA handles this by analyzing all dependent variables simultaneously. It creates linear combinations of your outcomes and tests whether group centroids differ in multivariate space.
Real-world applications are everywhere. A pharmaceutical researcher might compare treatment groups across multiple biomarkers. An educational psychologist could examine how teaching methods affect both reading comprehension and mathematical reasoning. A UX researcher might analyze how interface designs impact task completion time, error rates, and satisfaction scores together.
The rule is straightforward: if you have one categorical independent variable and two or more continuous dependent variables that are conceptually related, MANOVA is your tool.
Assumptions of MANOVA
MANOVA is powerful but demanding. Violating its assumptions can produce misleading results.
Multivariate normality requires that the dependent variables follow a multivariate normal distribution within each group. This is stricter than univariate normality—each variable must be normal, and their joint distribution must be multivariate normal.
Homogeneity of covariance matrices means the variance-covariance structure should be similar across groups. This is the multivariate extension of homogeneity of variance in ANOVA.
Independence assumes observations are independent of each other. Repeated measures or clustered data violate this assumption and require different approaches.
Absence of multicollinearity means your dependent variables should be correlated (otherwise, why use MANOVA?) but not so highly correlated that they’re redundant. Correlations above 0.90 cause problems.
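To screen for that kind of redundancy, inspect the correlation matrix of your dependent variables before fitting anything. With the iris measurements used throughout this guide:

```r
# Correlation matrix of the four iris measurements
r <- cor(iris[, 1:4])
round(r, 2)

# Flag any pair correlated above the 0.90 redundancy threshold
which(abs(r) > 0.90 & upper.tri(r), arr.ind = TRUE)
# In iris, only Petal.Length and Petal.Width (r of about 0.96) exceed it
```

If a pair exceeds the threshold, consider dropping one variable or combining the two before running MANOVA.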
Here’s how to test these assumptions:
# Install and load required packages
install.packages(c("MVN", "heplots", "car"))
library(MVN)
library(heplots)
library(car)
# Using iris data as example
data(iris)
# Test multivariate normality with Mardia's test
mvn_result <- mvn(data = iris[, 1:4],
                  mvnTest = "mardia",
                  univariateTest = "SW")
print(mvn_result$multivariateNormality)
# Test homogeneity of covariance matrices with Box's M
boxM(iris[, 1:4], iris$Species)
The mvn() function from the MVN package provides Mardia’s multivariate skewness and kurtosis tests. A non-significant result (p > 0.05) suggests multivariate normality holds.
Box’s M test is notoriously sensitive with large samples. A significant result doesn’t necessarily doom your analysis—Pillai’s Trace is robust to moderate violations. However, grossly unequal covariance matrices warrant caution.
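Box's M requires the heplots package. If you also want a quick base-R sanity check, Bartlett's test of equal variances can be run on each dependent variable separately; note this checks only the variances, not the full covariance structure that Box's M examines:

```r
dvs <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Bartlett's test of homogeneity of variance, one DV at a time
for (dv in dvs) {
  bt <- bartlett.test(iris[[dv]], iris$Species)
  cat(dv, ": p =", signif(bt$p.value, 3), "\n")
}
```

Significant results on several variables, as you will see with iris, reinforce the Box's M finding that the covariance matrices are not equal across species.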
Preparing Your Data in R
MANOVA requires a specific data structure: your dependent variables must be numeric columns, and your independent variable must be a factor. Wide format works best—one row per observation, with each dependent variable in its own column.
# Load necessary packages
library(car)
library(MASS)
library(heplots)
library(dplyr)
# Load and examine the iris dataset
data(iris)
# Check data structure
str(iris)
# 'data.frame': 150 obs. of 5 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species : Factor w/ 3 levels "setosa","versicolor",...
# Summary statistics by group
iris %>%
  group_by(Species) %>%
  summarise(
    n = n(),
    mean_sepal_l = mean(Sepal.Length),
    mean_sepal_w = mean(Sepal.Width),
    mean_petal_l = mean(Petal.Length),
    mean_petal_w = mean(Petal.Width)
  )
# Check for missing values
colSums(is.na(iris))
# Verify factor levels
levels(iris$Species)
Your independent variable must be a factor, not a character or numeric variable. Use as.factor() to convert if necessary. Missing values should be handled before analysis—MANOVA uses listwise deletion by default.
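As a sketch of both fixes—the character column and the missing value below are introduced artificially for illustration:

```r
# Make an illustrative copy of iris exhibiting the two common problems
df <- iris
df$Species <- as.character(df$Species)  # grouping variable stored as character
df$Sepal.Length[1] <- NA                # one missing measurement

# Fix 1: convert the grouping variable to a factor
df$Species <- as.factor(df$Species)

# Fix 2: drop incomplete rows explicitly before fitting
df <- na.omit(df)

is.factor(df$Species)  # TRUE
nrow(df)               # 149
```

Dropping incomplete rows yourself makes the analyzed sample size explicit rather than leaving it to silent listwise deletion.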
Running the MANOVA Test
The manova() function in base R handles the heavy lifting. The key is the cbind() function, which binds your dependent variables together on the left side of the formula.
# Run MANOVA with all four measurements as DVs
# Testing: Do iris species differ on sepal and petal measurements?
manova_model <- manova(cbind(Sepal.Length, Sepal.Width,
                             Petal.Length, Petal.Width) ~ Species,
                       data = iris)
# View the basic summary
summary(manova_model)
The output shows Pillai’s Trace by default:
            Df Pillai approx F num Df den Df    Pr(>F)
Species      2 1.1919   53.466      8    290 < 2.2e-16 ***
Residuals  147
This tells us that iris species differ significantly on the combined set of sepal and petal measurements (p < 0.001). But which test statistic should you trust?
Interpreting MANOVA Output
MANOVA provides four test statistics, each with different properties:
Pillai’s Trace is the most robust to violations of assumptions and works well with small or unequal sample sizes. It’s the safest default choice.
Wilks’ Lambda is the most commonly reported in published research. It represents the proportion of variance not explained by group differences.
Hotelling-Lawley Trace is powerful when group differences are concentrated on one linear combination of dependent variables.
Roy’s Largest Root examines only the first discriminant function. It’s the most powerful test when group differences lie along a single dimension, but the most affected by assumption violations.
# Compare all four test statistics
summary(manova_model, test = "Pillai")
summary(manova_model, test = "Wilks")
summary(manova_model, test = "Hotelling-Lawley")
summary(manova_model, test = "Roy")
# Detailed output with Wilks' Lambda
summary(manova_model, test = "Wilks")
# Df Wilks approx F num Df den Df Pr(>F)
# Species 2 0.02344 199.15 8 288 < 2.2e-16 ***
For the iris data, all four tests yield highly significant results. In practice, report Pillai’s Trace unless you have theoretical reasons to prefer another. When tests disagree, investigate why—it often signals assumption violations or complex group differences.
The approximate F statistic converts the multivariate test to an F distribution for p-value calculation. The degrees of freedom depend on the number of groups, dependent variables, and sample size.
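If you need those numbers programmatically—for a write-up, say—the summary object exposes them in its stats matrix:

```r
# Fit the model and pull the test statistics from the summary object
manova_model <- manova(cbind(Sepal.Length, Sepal.Width,
                             Petal.Length, Petal.Width) ~ Species,
                       data = iris)
s <- summary(manova_model, test = "Pillai")$stats

s["Species", "Pillai"]    # 1.1919 (Pillai's Trace)
s["Species", "approx F"]  # 53.466 (approximate F)
s["Species", "num Df"]    # 8
s["Species", "den Df"]    # 290
```

The same matrix holds the residual degrees of freedom and the p-value, so a reporting sentence can be assembled entirely from code.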
Post-Hoc Analysis
A significant MANOVA tells you groups differ somewhere, but not where. Follow-up analyses identify which dependent variables drive the effect.
# Univariate ANOVAs for each dependent variable
summary.aov(manova_model)
# Output shows separate ANOVA for each DV:
# Response Sepal.Length :
# Df Sum Sq Mean Sq F value Pr(>F)
# Species 2 63.21 31.606 119.26 < 2.2e-16 ***
# Residuals 147 38.96 0.265
#
# Response Sepal.Width :
# Df Sum Sq Mean Sq F value Pr(>F)
# Species 2 11.35 5.672 49.16 < 2.2e-16 ***
# Residuals 147 16.96 0.115
# Pairwise comparisons with Tukey's HSD
# First, run individual ANOVAs
aov_sepal_length <- aov(Sepal.Length ~ Species, data = iris)
TukeyHSD(aov_sepal_length)
# For all DVs at once
for (dv in c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")) {
  cat("\n", dv, ":\n")
  fml <- as.formula(paste(dv, "~ Species"))
  model <- aov(fml, data = iris)
  print(TukeyHSD(model))
}
The univariate ANOVAs show that all four measurements differ significantly across species. Tukey’s HSD then reveals which specific species pairs differ on each variable.
Consider applying a Bonferroni correction to your univariate tests. With four dependent variables, divide your alpha by 4 (0.05/4 = 0.0125) to maintain family-wise error control.
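One way to implement that correction is to collect the univariate p-values and pass them to p.adjust():

```r
dvs <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Extract the Species p-value from each univariate ANOVA
p_values <- sapply(dvs, function(dv) {
  fit <- aov(as.formula(paste(dv, "~ Species")), data = iris)
  summary(fit)[[1]][["Pr(>F)"]][1]
})

# Bonferroni-adjusted p-values (all remain far below 0.05 for iris)
p.adjust(p_values, method = "bonferroni")
```

Comparing adjusted p-values against 0.05 is equivalent to comparing the raw p-values against the divided alpha of 0.0125.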
Reporting Results
APA-style reporting for MANOVA includes the test statistic, F value, degrees of freedom, p-value, and effect size.
# Calculate a multivariate effect size from Pillai's Trace
# For a one-way MANOVA, partial eta-squared = V / s,
# where s = min(number of groups - 1, number of DVs)
manova_summary <- summary(manova_model, test = "Pillai")
pillai_trace <- manova_summary$stats[1, "Pillai"]
partial_eta_sq <- pillai_trace / min(3 - 1, 4)  # s = 2 for three species, four DVs
partial_eta_sq  # about 0.60
# For visualization
library(ggplot2)
library(tidyr)
# Reshape data for plotting
iris_long <- iris %>%
  pivot_longer(cols = Sepal.Length:Petal.Width,
               names_to = "Measurement",
               values_to = "Value")

# Create grouped boxplot
ggplot(iris_long, aes(x = Species, y = Value, fill = Species)) +
  geom_boxplot() +
  facet_wrap(~ Measurement, scales = "free_y") +
  theme_minimal() +
  labs(title = "Iris Measurements by Species",
       y = "Measurement Value") +
  theme(legend.position = "none")
# HE plot for visualizing multivariate effects
library(heplots)
heplot(manova_model,
       fill = TRUE,
       fill.alpha = 0.1,
       main = "HE Plot: Species Effect on Sepal Dimensions")
A proper APA-style write-up would read:
A one-way MANOVA examined the effect of iris species on four morphological measurements (sepal length, sepal width, petal length, and petal width). There was a statistically significant difference between species on the combined dependent variables, Pillai’s Trace = 1.19, F(8, 290) = 53.47, p < .001. Follow-up univariate ANOVAs revealed significant species differences on all four measurements (all ps < .001).
The HE plot (hypothesis-error plot) from the heplots package visualizes the multivariate effect. The error ellipse represents within-group (error) variation, while the hypothesis ellipse shows between-group variation. Under the default significance scaling, the effect is significant (by Roy’s test) whenever the hypothesis ellipse extends anywhere beyond the error ellipse.
MANOVA is a gateway to understanding multivariate relationships. Master it, and you’ll see patterns that univariate thinkers miss.