How to Perform Dunnett's Test in R
Key Insights
- Dunnett’s test is specifically designed for comparing multiple treatment groups against a single control, offering more statistical power than general pairwise comparisons like Tukey’s HSD when that’s your research question.
- The multcomp package's glht() function provides the most flexible implementation, while DescTools::DunnettTest() offers a simpler interface for straightforward analyses.
- Always verify ANOVA assumptions before running Dunnett's test: violations of normality or homogeneity of variance can invalidate your results and lead to incorrect conclusions.
Introduction to Dunnett’s Test
When you run an experiment with a control group and multiple treatment conditions, you often don’t care about comparing treatments to each other. You want to know which treatments differ from the control. This is exactly what Dunnett’s test does.
Dunnett’s test is a post-hoc multiple comparison procedure designed for comparing several treatment means against a single control mean. Unlike Tukey’s Honest Significant Difference (HSD), which compares all possible pairs, Dunnett’s test focuses exclusively on treatment-versus-control comparisons. This focused approach gives you more statistical power because you’re making fewer comparisons and therefore applying a less severe correction for multiple testing.
Use Dunnett’s test when:
- You have a clear control group (placebo, baseline, or standard treatment)
- You only care about treatment-vs-control comparisons
- You want to maximize power for detecting differences from control
Use Tukey’s HSD or similar methods when:
- All pairwise comparisons are scientifically meaningful
- There’s no designated control group
- You need to compare treatments to each other
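The power advantage comes directly from the number of comparisons each procedure makes. A quick base-R calculation for a four-group design like the one used below:

```r
# Comparisons needed for k = 4 groups (1 control + 3 treatments)
k <- 4
tukey_comparisons   <- choose(k, 2)  # all pairwise comparisons
dunnett_comparisons <- k - 1         # treatment-vs-control only

tukey_comparisons    # 6
dunnett_comparisons  # 3
```

Fewer comparisons means a smaller multiplicity correction, which is exactly where Dunnett's extra power comes from.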
Prerequisites and Setup
R offers two main packages for Dunnett’s test: multcomp and DescTools. The multcomp package provides a general framework for multiple comparisons and is more flexible, while DescTools offers a simpler, more direct function.
# Install packages if needed
install.packages("multcomp")
install.packages("DescTools")
# Load libraries
library(multcomp)
library(DescTools)
# For visualization
library(ggplot2)
The multcomp package uses a general linear hypothesis framework, which means you can apply it to various model types beyond simple one-way ANOVA. This flexibility comes at the cost of slightly more complex syntax.
Preparing Your Data
Dunnett’s test requires your data in a specific format. The grouping variable must be a factor, and critically, the control group must be the first level of that factor. R orders factor levels alphabetically by default, which often puts “Control” first—but don’t rely on this assumption.
Let’s create a sample dataset simulating a drug trial with one control group and three treatment groups:
# Set seed for reproducibility
set.seed(42)
# Create sample data: drug efficacy study
# Control group + 3 treatment doses
n_per_group <- 20
drug_data <- data.frame(
group = factor(rep(c("Control", "Low_Dose", "Medium_Dose", "High_Dose"),
each = n_per_group)),
response = c(
rnorm(n_per_group, mean = 50, sd = 8), # Control
rnorm(n_per_group, mean = 53, sd = 8), # Low dose - slight effect
rnorm(n_per_group, mean = 58, sd = 8), # Medium dose - moderate effect
rnorm(n_per_group, mean = 62, sd = 8) # High dose - strong effect
)
)
# Verify control is the reference level
levels(drug_data$group)
# [1] "Control" "High_Dose" "Low_Dose" "Medium_Dose"
Notice that alphabetical ordering put “Control” first, but “High_Dose” comes before “Low_Dose” and “Medium_Dose”. If your control group isn’t first, explicitly set it:
# Explicitly set control as reference level
drug_data$group <- relevel(drug_data$group, ref = "Control")
# Or reorder all levels explicitly
drug_data$group <- factor(drug_data$group,
levels = c("Control", "Low_Dose",
"Medium_Dose", "High_Dose"))
# Verify the order
levels(drug_data$group)
# [1] "Control" "Low_Dose" "Medium_Dose" "High_Dose"
Quick summary statistics help verify your data looks reasonable:
# Summary by group
aggregate(response ~ group, data = drug_data,
FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))
Running ANOVA First
Dunnett's test is typically run as a post-hoc procedure: first establish with ANOVA that differences exist among the groups. (Strictly speaking, Dunnett's test controls the family-wise error rate on its own, but running the omnibus test first is standard practice and guards against fishing for effects.)
# Fit one-way ANOVA
anova_model <- aov(response ~ group, data = drug_data)
# View ANOVA results
summary(anova_model)
Df Sum Sq Mean Sq F value Pr(>F)
group 3 1847 615.8 9.876 1.23e-05 ***
Residuals 76 4740 62.4
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
With a p-value well below 0.05, we have evidence that at least one group mean differs from the others. Now we can proceed with Dunnett’s test to identify which specific treatments differ from the control.
Store the model object—you’ll pass it directly to the Dunnett’s test functions.
Performing Dunnett’s Test
Method 1: Using multcomp
The multcomp package uses the glht() (general linear hypotheses) function combined with mcp() (multiple comparison procedure) to specify Dunnett contrasts:
# Perform Dunnett's test using multcomp
dunnett_result <- glht(anova_model, linfct = mcp(group = "Dunnett"))
# View summary with adjusted p-values
summary(dunnett_result)
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Dunnett Contrasts
Fit: aov(formula = response ~ group, data = drug_data)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
Low_Dose - Control == 0 2.847 2.497 1.140 0.48892
Medium_Dose - Control == 0 8.234 2.497 3.297 0.00432 **
High_Dose - Control == 0 12.156 2.497 4.868 < 0.001 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The output shows each treatment compared to the control. The “Estimate” column shows the difference in means (treatment minus control). Adjusted p-values account for multiple comparisons.
Get confidence intervals for the differences:
# Confidence intervals
confint(dunnett_result)
Simultaneous Confidence Intervals
Multiple Comparisons of Means: Dunnett Contrasts
Fit: aov(formula = response ~ group, data = drug_data)
Quantile = 2.394
95% family-wise confidence level
Linear Hypotheses:
Estimate lwr upr
Low_Dose - Control == 0 2.847 -3.131 8.825
Medium_Dose - Control == 0 8.234 2.256 14.212
High_Dose - Control == 0 12.156 6.178 18.134
Intervals that don’t include zero indicate significant differences from control.
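You can check this programmatically. A minimal base-R sketch, using the interval values printed above:

```r
# CI bounds copied from the confint() output above
ci <- rbind(
  Low_Dose    = c(estimate = 2.847,  lwr = -3.131, upr = 8.825),
  Medium_Dose = c(estimate = 8.234,  lwr = 2.256,  upr = 14.212),
  High_Dose   = c(estimate = 12.156, lwr = 6.178,  upr = 18.134)
)

# A difference is significant at the family-wise level
# when its interval excludes zero
significant <- ci[, "lwr"] > 0 | ci[, "upr"] < 0
significant
# Low_Dose: FALSE, Medium_Dose: TRUE, High_Dose: TRUE
```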
Method 2: Using DescTools
For a simpler interface, DescTools provides DunnettTest():
# Perform Dunnett's test using DescTools
DunnettTest(response ~ group, data = drug_data)
Dunnett's test for comparing several treatments with a control :
95% family-wise confidence level
$Control
diff lwr.ci upr.ci pval
Low_Dose-Control 2.847 -3.131 8.825 0.4889
Medium_Dose-Control 8.234 2.256 14.212 0.0043 **
High_Dose-Control 12.156 6.178 18.134 <0.001 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The DescTools output is more compact and directly shows confidence intervals alongside p-values. Both methods produce identical statistical results—choose based on your workflow preferences.
You can specify a different control group if needed:
# Specify control group explicitly
DunnettTest(response ~ group, data = drug_data, control = "Control")
Interpreting and Visualizing Results
From our results, we can conclude:
- Low Dose vs Control: Not significant (p = 0.489). The confidence interval includes zero (-3.13 to 8.83).
- Medium Dose vs Control: Significant (p = 0.004). Mean response is 8.2 units higher than control.
- High Dose vs Control: Highly significant (p < 0.001). Mean response is 12.2 units higher than control.
Visualizing confidence intervals makes these results immediately interpretable:
# Plot confidence intervals from multcomp
plot(confint(dunnett_result))
For publication-quality graphics, use ggplot2:
# Extract results for ggplot
dunnett_summary <- as.data.frame(confint(dunnett_result)$confint)
dunnett_summary$comparison <- rownames(dunnett_summary)
# Clean up comparison names
dunnett_summary$comparison <- gsub(" - Control", "", dunnett_summary$comparison)
# Create forest plot
ggplot(dunnett_summary, aes(x = Estimate, y = comparison)) +
geom_point(size = 3) +
geom_errorbarh(aes(xmin = lwr, xmax = upr), height = 0.2) +
geom_vline(xintercept = 0, linetype = "dashed", color = "red") +
labs(
x = "Difference from Control (95% CI)",
y = "Treatment",
title = "Dunnett's Test: Treatment Effects vs Control"
) +
theme_minimal() +
theme(
axis.text.y = element_text(size = 11),
plot.title = element_text(hjust = 0.5)
)
The red dashed line at zero represents no difference from control. Confidence intervals crossing this line indicate non-significant differences.
Conclusion and Best Practices
Before trusting your Dunnett’s test results, verify ANOVA assumptions:
# Check normality of residuals
shapiro.test(residuals(anova_model))
# Check homogeneity of variance
bartlett.test(response ~ group, data = drug_data)
# Visual diagnostics
par(mfrow = c(2, 2))
plot(anova_model)
If assumptions are violated, consider these alternatives:
- Non-normal data: Use Dunn’s test with Holm adjustment after Kruskal-Wallis
- Unequal variances: Use Games-Howell test or Welch’s ANOVA followed by pairwise t-tests with appropriate corrections
- Small samples: Consider permutation-based approaches
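For the non-normal case, here is a sketch using DescTools (already loaded above) on hypothetical skewed data: run Kruskal-Wallis as the omnibus test, then DunnTest() with Holm adjustment for the pairwise comparisons.

```r
library(DescTools)

# Hypothetical right-skewed data for illustration
set.seed(42)
skewed_data <- data.frame(
  group = factor(rep(c("Control", "Treat_A", "Treat_B"), each = 20),
                 levels = c("Control", "Treat_A", "Treat_B")),
  response = c(rexp(20, rate = 1/10), rexp(20, rate = 1/14),
               rexp(20, rate = 1/18))
)

# Omnibus test first, then Dunn's test with Holm adjustment
kw <- kruskal.test(response ~ group, data = skewed_data)
kw$p.value
DunnTest(response ~ group, data = skewed_data, method = "holm")
```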
Common pitfalls to avoid:
- Wrong reference level: Always verify your control is the first factor level
- Skipping ANOVA: Post-hoc tests without a significant omnibus test inflate false positives
- One-sided vs two-sided: Default is two-sided; specify alternative = "greater" or alternative = "less" in glht() if you have directional hypotheses
- Ignoring effect sizes: Statistical significance doesn't equal practical importance; report confidence intervals and consider effect size measures
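If you expect treatments only to increase the response, a one-sided Dunnett test gains power. A minimal self-contained sketch with multcomp, on hypothetical data (with your own model, just add the alternative argument to the glht() call):

```r
library(multcomp)

# Hypothetical three-group data with increasing means
set.seed(1)
d <- data.frame(
  group = factor(rep(c("Control", "Treat_A", "Treat_B"), each = 15),
                 levels = c("Control", "Treat_A", "Treat_B")),
  y = c(rnorm(15, 50, 8), rnorm(15, 55, 8), rnorm(15, 60, 8))
)
m <- aov(y ~ group, data = d)

# alternative = "greater" tests H1: treatment mean > control mean
one_sided <- glht(m, linfct = mcp(group = "Dunnett"),
                  alternative = "greater")
summary(one_sided)
```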
Dunnett’s test remains the gold standard for treatment-versus-control comparisons. Use it when your experimental design calls for it, verify your assumptions, and report both p-values and confidence intervals for complete transparency.