How to Perform Dunnett's Test in R
Key Insights
- Dunnett’s test is specifically designed for comparing multiple treatment groups against a single control, offering more statistical power than general pairwise comparisons like Tukey’s HSD when that’s your research question.
- The multcomp package's glht() function provides the most flexible implementation, while DescTools::DunnettTest() offers a simpler interface for straightforward analyses.
- Always verify ANOVA assumptions before running Dunnett's test: violations of normality or homogeneity of variance can invalidate your results and lead to incorrect conclusions.
Introduction to Dunnett’s Test
When you run an experiment with a control group and multiple treatment conditions, you often don’t care about comparing treatments to each other. You want to know which treatments differ from the control. This is exactly what Dunnett’s test does.
Dunnett’s test is a post-hoc multiple comparison procedure designed for comparing several treatment means against a single control mean. Unlike Tukey’s Honest Significant Difference (HSD), which compares all possible pairs, Dunnett’s test focuses exclusively on treatment-versus-control comparisons. This focused approach gives you more statistical power because you’re making fewer comparisons and therefore applying a less severe correction for multiple testing.
Use Dunnett’s test when:
- You have a clear control group (placebo, baseline, or standard treatment)
- You only care about treatment-vs-control comparisons
- You want to maximize power for detecting differences from control
Use Tukey’s HSD or similar methods when:
- All pairwise comparisons are scientifically meaningful
- There’s no designated control group
- You need to compare treatments to each other
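The power advantage comes directly from the number of comparisons each procedure makes. A quick base-R calculation for a four-group design like the one used below:

```r
# Comparisons needed for k = 4 groups (1 control + 3 treatments)
k <- 4
tukey_comparisons   <- choose(k, 2)  # all pairwise comparisons
dunnett_comparisons <- k - 1         # treatment-vs-control only

tukey_comparisons    # 6
dunnett_comparisons  # 3
```

Fewer comparisons means a smaller multiplicity correction, which is exactly where Dunnett's extra power comes from.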
Prerequisites and Setup
R offers two main packages for Dunnett’s test: multcomp and DescTools. The multcomp package provides a general framework for multiple comparisons and is more flexible, while DescTools offers a simpler, more direct function.
# Install packages if needed
install.packages("multcomp")
install.packages("DescTools")
# Load libraries
library(multcomp)
library(DescTools)
# For visualization
library(ggplot2)
The multcomp package uses a general linear hypothesis framework, which means you can apply it to various model types beyond simple one-way ANOVA. This flexibility comes at the cost of slightly more complex syntax.
Preparing Your Data
Dunnett’s test requires your data in a specific format. The grouping variable must be a factor, and critically, the control group must be the first level of that factor. R orders factor levels alphabetically by default, which often puts “Control” first—but don’t rely on this assumption.
Let’s create a sample dataset simulating a drug trial with one control group and three treatment groups:
# Set seed for reproducibility
set.seed(42)
# Create sample data: drug efficacy study
# Control group + 3 treatment doses
n_per_group <- 20
drug_data <- data.frame(
group = factor(rep(c("Control", "Low_Dose", "Medium_Dose", "High_Dose"),
each = n_per_group)),
response = c(
rnorm(n_per_group, mean = 50, sd = 8), # Control
rnorm(n_per_group, mean = 53, sd = 8), # Low dose - slight effect
rnorm(n_per_group, mean = 58, sd = 8), # Medium dose - moderate effect
rnorm(n_per_group, mean = 62, sd = 8) # High dose - strong effect
)
)
# Verify control is the reference level
levels(drug_data$group)
# [1] "Control" "High_Dose" "Low_Dose" "Medium_Dose"
Notice that alphabetical ordering put “Control” first, but “High_Dose” comes before “Low_Dose” and “Medium_Dose”. If your control group isn’t first, explicitly set it:
# Explicitly set control as reference level
drug_data$group <- relevel(drug_data$group, ref = "Control")
# Or reorder all levels explicitly
drug_data$group <- factor(drug_data$group,
levels = c("Control", "Low_Dose",
"Medium_Dose", "High_Dose"))
# Verify the order
levels(drug_data$group)
# [1] "Control" "Low_Dose" "Medium_Dose" "High_Dose"
Quick summary statistics help verify your data looks reasonable:
# Summary by group
aggregate(response ~ group, data = drug_data,
FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))
Running ANOVA First
Dunnett's test is typically run as a post-hoc procedure: first establish with ANOVA that differences exist among the groups. (Strictly speaking, Dunnett's test controls the family-wise error rate on its own, but running the omnibus test first is standard practice and guards against fishing for effects.)
# Fit one-way ANOVA
anova_model <- aov(response ~ group, data = drug_data)
# View ANOVA results
summary(anova_model)
Df Sum Sq Mean Sq F value Pr(>F)
group 3 1847 615.8 9.876 1.23e-05 ***
Residuals 76 4740 62.4
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
With a p-value well below 0.05, we have evidence that at least one group mean differs from the others. Now we can proceed with Dunnett’s test to identify which specific treatments differ from the control.
Store the model object—you’ll pass it directly to the Dunnett’s test functions.
Performing Dunnett’s Test
Method 1: Using multcomp
The multcomp package uses the glht() (general linear hypotheses) function combined with mcp() (multiple comparison procedure) to specify Dunnett contrasts:
# Perform Dunnett's test using multcomp
dunnett_result <- glht(anova_model, linfct = mcp(group = "Dunnett"))
# View summary with adjusted p-values
summary(dunnett_result)
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Dunnett Contrasts
Fit: aov(formula = response ~ group, data = drug_data)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
Low_Dose - Control == 0 2.847 2.497 1.140 0.48892
Medium_Dose - Control == 0 8.234 2.497 3.297 0.00432 **
High_Dose - Control == 0 12.156 2.497 4.868 < 0.001 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
The output shows each treatment compared to the control. The “Estimate” column shows the difference in means (treatment minus control). Adjusted p-values account for multiple comparisons.
Get confidence intervals for the differences:
# Confidence intervals
confint(dunnett_result)
Simultaneous Confidence Intervals
Multiple Comparisons of Means: Dunnett Contrasts
Fit: aov(formula = response ~ group, data = drug_data)
Quantile = 2.394
95% family-wise confidence level
Linear Hypotheses:
Estimate lwr upr
Low_Dose - Control == 0 2.847 -3.131 8.825
Medium_Dose - Control == 0 8.234 2.256 14.212
High_Dose - Control == 0 12.156 6.178 18.134
Intervals that don’t include zero indicate significant differences from control.
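You can check this programmatically. A minimal base-R sketch, using the interval values printed above:

```r
# CI bounds copied from the confint() output above
ci <- rbind(
  Low_Dose    = c(estimate = 2.847,  lwr = -3.131, upr = 8.825),
  Medium_Dose = c(estimate = 8.234,  lwr = 2.256,  upr = 14.212),
  High_Dose   = c(estimate = 12.156, lwr = 6.178,  upr = 18.134)
)

# A difference is significant at the family-wise level
# when its interval excludes zero
significant <- ci[, "lwr"] > 0 | ci[, "upr"] < 0
significant
# Low_Dose: FALSE, Medium_Dose: TRUE, High_Dose: TRUE
```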
Method 2: Using DescTools
For a simpler interface, DescTools provides DunnettTest():
# Perform Dunnett's test using DescTools
DunnettTest(response ~ group, data = drug_data)
Dunnett's test for comparing several treatments with a control :
95% family-wise confidence level
$Control
diff lwr.ci upr.ci pval
Low_Dose-Control 2.847 -3.131 8.825 0.4889
Medium_Dose-Control 8.234 2.256 14.212 0.0043 **
High_Dose-Control 12.156 6.178 18.134 <0.001 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The DescTools output is more compact and directly shows confidence intervals alongside p-values. Both methods produce identical statistical results—choose based on your workflow preferences.
You can specify a different control group if needed:
# Specify control group explicitly
DunnettTest(response ~ group, data = drug_data, control = "Control")
Interpreting and Visualizing Results
From our results, we can conclude:
- Low Dose vs Control: Not significant (p = 0.489). The confidence interval includes zero (-3.13 to 8.83).
- Medium Dose vs Control: Significant (p = 0.004). Mean response is 8.2 units higher than control.
- High Dose vs Control: Highly significant (p < 0.001). Mean response is 12.2 units higher than control.
Visualizing confidence intervals makes these results immediately interpretable:
# Plot confidence intervals from multcomp
plot(confint(dunnett_result))
For publication-quality graphics, use ggplot2:
# Extract results for ggplot
dunnett_summary <- as.data.frame(confint(dunnett_result)$confint)
dunnett_summary$comparison <- rownames(dunnett_summary)
# Clean up comparison names
dunnett_summary$comparison <- gsub(" - Control", "", dunnett_summary$comparison)
# Create forest plot
ggplot(dunnett_summary, aes(x = Estimate, y = comparison)) +
geom_point(size = 3) +
geom_errorbarh(aes(xmin = lwr, xmax = upr), height = 0.2) +
geom_vline(xintercept = 0, linetype = "dashed", color = "red") +
labs(
x = "Difference from Control (95% CI)",
y = "Treatment",
title = "Dunnett's Test: Treatment Effects vs Control"
) +
theme_minimal() +
theme(
axis.text.y = element_text(size = 11),
plot.title = element_text(hjust = 0.5)
)
The red dashed line at zero represents no difference from control. Confidence intervals crossing this line indicate non-significant differences.
Conclusion and Best Practices
Before trusting your Dunnett’s test results, verify ANOVA assumptions:
# Check normality of residuals
shapiro.test(residuals(anova_model))
# Check homogeneity of variance
bartlett.test(response ~ group, data = drug_data)
# Visual diagnostics
par(mfrow = c(2, 2))
plot(anova_model)
If assumptions are violated, consider these alternatives:
- Non-normal data: Use Dunn’s test with Holm adjustment after Kruskal-Wallis
- Unequal variances: Use Games-Howell test or Welch’s ANOVA followed by pairwise t-tests with appropriate corrections
- Small samples: Consider permutation-based approaches
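For the non-normal case, here is a sketch using DescTools (already loaded above) on hypothetical skewed data: run Kruskal-Wallis as the omnibus test, then DunnTest() with Holm adjustment for the pairwise comparisons.

```r
library(DescTools)

# Hypothetical right-skewed data for illustration
set.seed(42)
skewed_data <- data.frame(
  group = factor(rep(c("Control", "Treat_A", "Treat_B"), each = 20),
                 levels = c("Control", "Treat_A", "Treat_B")),
  response = c(rexp(20, rate = 1/10), rexp(20, rate = 1/14),
               rexp(20, rate = 1/18))
)

# Omnibus test first, then Dunn's test with Holm adjustment
kw <- kruskal.test(response ~ group, data = skewed_data)
kw$p.value
DunnTest(response ~ group, data = skewed_data, method = "holm")
```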
Common pitfalls to avoid:
- Wrong reference level: Always verify your control is the first factor level
- Skipping ANOVA: Post-hoc tests without a significant omnibus test inflate false positives
- One-sided vs two-sided: Default is two-sided; specify alternative = "greater" or alternative = "less" in glht() if you have directional hypotheses
- Ignoring effect sizes: Statistical significance doesn't equal practical importance; report confidence intervals and consider effect size measures
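If you expect treatments only to increase the response, a one-sided Dunnett test gains power. A minimal self-contained sketch with multcomp, on hypothetical data (with your own model, just add the alternative argument to the glht() call):

```r
library(multcomp)

# Hypothetical three-group data with increasing means
set.seed(1)
d <- data.frame(
  group = factor(rep(c("Control", "Treat_A", "Treat_B"), each = 15),
                 levels = c("Control", "Treat_A", "Treat_B")),
  y = c(rnorm(15, 50, 8), rnorm(15, 55, 8), rnorm(15, 60, 8))
)
m <- aov(y ~ group, data = d)

# alternative = "greater" tests H1: treatment mean > control mean
one_sided <- glht(m, linfct = mcp(group = "Dunnett"),
                  alternative = "greater")
summary(one_sided)
```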
Dunnett’s test remains the gold standard for treatment-versus-control comparisons. Use it when your experimental design calls for it, verify your assumptions, and report both p-values and confidence intervals for complete transparency.