How to Perform McNemar's Test in R


Key Insights

  • McNemar’s test compares paired binary outcomes by analyzing only the discordant pairs—cases where responses changed between conditions—making it ideal for before/after studies and matched case-control designs.
  • Use the exact binomial method when your discordant pairs total fewer than 25; the chi-square approximation with continuity correction works well for larger samples.
  • The test answers a specific question: did the proportion of “yes” responses change between two conditions for the same subjects? It cannot tell you about unpaired groups or ordinal outcomes.

Introduction to McNemar’s Test

McNemar’s test is a non-parametric statistical test for paired nominal data. You use it when you have the same subjects measured twice on a binary outcome, or when you have matched pairs where each pair shares relevant characteristics.

The classic use cases include:

  • Before/after studies: Did patients improve after treatment?
  • Matched case-control studies: Comparing disease status between matched pairs
  • Method comparison: Do two diagnostic tests agree on the same patients?
  • Repeated surveys: Did opinions change between two time points?

The test has three key assumptions. First, your data must be paired—each observation in one condition has a corresponding observation in the other. Second, the outcome variable must be binary (yes/no, positive/negative, success/failure). Third, the pairs must be independent of each other, even though observations within pairs are dependent.

McNemar’s test specifically examines whether the marginal proportions differ. In plain terms: did the overall rate of “yes” responses change between the two conditions?

Understanding the Test Statistic

McNemar’s test operates on a 2x2 contingency table, but unlike a standard chi-square test, this table represents paired observations. Each cell shows counts of pairs, not individual observations.

The table structure looks like this:

                   Condition 2: Yes   Condition 2: No
Condition 1: Yes          a                  b
Condition 1: No           c                  d

Cell a contains pairs where both conditions yielded “yes.” Cell d contains pairs where both yielded “no.” These are concordant pairs—they tell us nothing about change.

Cells b and c are the discordant pairs. Cell b shows pairs that switched from “yes” to “no.” Cell c shows pairs that switched from “no” to “yes.” McNemar’s test focuses entirely on these discordant pairs.

The test statistic follows a chi-square distribution with 1 degree of freedom:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$

With continuity correction (recommended for smaller samples):

$$\chi^2 = \frac{(|b - c| - 1)^2}{b + c}$$

Let’s create a sample matrix in R:

# Create a 2x2 matrix for paired binary outcomes
# Rows: Condition 1 (Before), Columns: Condition 2 (After)
paired_data <- matrix(
  c(30, 12,   # Before Yes: 30 stayed Yes, 12 switched to No
    8, 50),   # Before No: 8 switched to Yes, 50 stayed No
  nrow = 2,
  byrow = TRUE,
  dimnames = list(
    "Before" = c("Yes", "No"),
    "After" = c("Yes", "No")
  )
)

print(paired_data)
#        After
# Before  Yes No
#    Yes  30 12
#    No    8 50

Here we have 100 total pairs. The concordant pairs (30 + 50 = 80) don’t factor into the test. The discordant pairs (12 + 8 = 20) drive the analysis.
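As a sanity check, both versions of the statistic can be computed by hand from the discordant counts in the matrix above:

```r
# Discordant cells from the paired_data matrix above
b <- 12  # switched Yes -> No
c <- 8   # switched No -> Yes

chisq_raw <- (b - c)^2 / (b + c)           # (12 - 8)^2 / 20 = 0.8
chisq_cc  <- (abs(b - c) - 1)^2 / (b + c)  # (4 - 1)^2 / 20 = 0.45 (Yates' correction)

# Two-sided p-value from the chi-square distribution with 1 df
p_value <- pchisq(chisq_cc, df = 1, lower.tail = FALSE)
round(c(uncorrected = chisq_raw, corrected = chisq_cc, p = p_value), 4)
```

The corrected statistic and p-value match what `mcnemar.test()` reports below.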

Performing McNemar’s Test with Base R

R provides mcnemar.test() in the base stats package. No additional libraries required.

# Perform McNemar's test
result <- mcnemar.test(paired_data)
print(result)

#         McNemar's Chi-squared test with continuity correction
# 
# data:  paired_data
# McNemar's chi-squared = 0.45, df = 1, p-value = 0.5023

Interpreting the output: The chi-squared value of 0.45 with a p-value of 0.5023 indicates no statistically significant difference in the marginal proportions. Before treatment, 42 subjects said “yes” (30 + 12). After treatment, 38 subjects said “yes” (30 + 8). This difference isn’t statistically significant.

You can also pass raw data in table form:

# Alternative: Create from raw paired observations
before <- c(rep("Yes", 42), rep("No", 58))
after <- c(rep("Yes", 30), rep("No", 12), rep("Yes", 8), rep("No", 50))

# Create contingency table
# Note: table() orders factor levels alphabetically ("No" before "Yes"),
# which swaps cells b and c -- McNemar's statistic is unaffected
contingency <- table(before, after)
mcnemar.test(contingency)

The function returns an object of class "htest" (a list). Access specific components programmatically:

result$statistic  # Chi-squared value
result$p.value    # P-value
result$parameter  # Degrees of freedom

Exact vs. Asymptotic Methods

The chi-square approximation works well when you have sufficient discordant pairs. The rule of thumb: use the exact test when b + c < 25.
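One way to encode that rule of thumb is a small helper function. This is a sketch using our own naming and threshold convention, not anything built into base R:

```r
# Pick exact vs. asymptotic test by discordant-pair count
# (mcnemar_auto and the threshold of 25 are our own convention)
mcnemar_auto <- function(tab, threshold = 25) {
  b <- tab[1, 2]
  c <- tab[2, 1]
  if (b + c < threshold) {
    binom.test(b, b + c, p = 0.5)  # exact test on the discordant pairs
  } else {
    mcnemar.test(tab)              # chi-square with continuity correction
  }
}

# With the earlier example (b + c = 20), the exact test is selected
mcnemar_auto(matrix(c(30, 12, 8, 50), nrow = 2, byrow = TRUE))
```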

By default, mcnemar.test() applies Yates’ continuity correction. You can disable it:

# Without continuity correction
mcnemar.test(paired_data, correct = FALSE)

#         McNemar's Chi-squared test
# 
# data:  paired_data
# McNemar's chi-squared = 0.8, df = 1, p-value = 0.3711

Notice the p-value changes. The uncorrected version is less conservative.

For small samples, use the exact binomial test. Under the null hypothesis, discordant pairs should split 50/50 between cells b and c:

# Exact McNemar's test using binomial test
b <- 12  # Changed from Yes to No
c <- 8   # Changed from No to Yes

binom.test(b, b + c, p = 0.5)

#         Exact binomial test
# 
# data:  b and b + c
# number of successes = 12, number of trials = 20, p-value = 0.5034
# alternative hypothesis: true probability of success is not equal to 0.5
# 95 percent confidence interval:
#  0.3605 0.8088
# sample estimates:
# probability of success 
#                    0.6

The exact2x2 package provides a more direct interface:

# Install if needed: install.packages("exact2x2")
library(exact2x2)

mcnemar.exact(paired_data)

Use the exact test for small discordant pair counts. Use the asymptotic test with continuity correction for moderate samples. For large samples (b + c > 100), the correction matters less.

Real-World Example: Treatment Effectiveness

Let’s work through a complete example. A clinic tests whether a new medication reduces chronic headaches. They survey 150 patients before and after a 3-month treatment period, asking: “Did you experience chronic headaches this month?”

# Patient headache data: Before and After treatment
# Create the paired outcome matrix
headache_data <- matrix(
  c(25, 35,   # Had headaches before: 25 still have them, 35 improved
    10, 80),  # No headaches before: 10 developed them, 80 still fine
  nrow = 2,
  byrow = TRUE,
  dimnames = list(
    "Before Treatment" = c("Headaches", "No Headaches"),
    "After Treatment" = c("Headaches", "No Headaches")
  )
)

print(headache_data)
#                   After Treatment
# Before Treatment   Headaches No Headaches
#   Headaches               25           35
#   No Headaches            10           80

# Calculate marginal totals
cat("Before treatment headache rate:", (25 + 35) / 150 * 100, "%\n")
cat("After treatment headache rate:", (25 + 10) / 150 * 100, "%\n")
# Before treatment headache rate: 40%
# After treatment headache rate: 23.33%

# Check discordant pairs
discordant_total <- 35 + 10
cat("Total discordant pairs:", discordant_total, "\n")
# Total discordant pairs: 45

# Since discordant pairs > 25, asymptotic test is appropriate
result <- mcnemar.test(headache_data)
print(result)

#         McNemar's Chi-squared test with continuity correction
# 
# data:  headache_data
# McNemar's chi-squared = 12.8, df = 1, p-value = 0.000347

# For comparison, exact test
binom.test(35, 45, p = 0.5)
# p-value = 0.0003019

The results show a statistically significant change (p < 0.001). Of the 45 patients whose headache status changed, 35 improved while only 10 got worse. The treatment appears effective.

Reporting and Visualization

When reporting McNemar’s test results, include the discordant pair counts, the test statistic, and the p-value:

“McNemar’s test revealed a significant change in headache prevalence following treatment (χ² = 12.8, df = 1, p < 0.001). Of the 45 patients whose status changed, 35 (77.8%) improved while 10 (22.2%) worsened.”
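A report sentence like that can be assembled programmatically from the test object. A self-contained sketch, rebuilding the headache matrix from the example above:

```r
# Rebuild the headache example and run the test
headache_data <- matrix(c(25, 35, 10, 80), nrow = 2, byrow = TRUE)
result <- mcnemar.test(headache_data)

b <- headache_data[1, 2]  # improved
c <- headache_data[2, 1]  # worsened

# sprintf's %d needs an integer, so coerce the df explicitly
report <- sprintf(
  "McNemar's test: chi-squared = %.1f, df = %d, p = %.3g; %d of %d changed patients (%.1f%%) improved.",
  result$statistic, as.integer(result$parameter), result$p.value,
  b, b + c, 100 * b / (b + c)
)
cat(report, "\n")
```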

Visualize the contingency table with a heatmap:

library(ggplot2)

# Convert matrix to data frame for ggplot
df <- as.data.frame(as.table(headache_data))
names(df) <- c("Before", "After", "Count")

# Create heatmap
ggplot(df, aes(x = After, y = Before, fill = Count)) +
  geom_tile(color = "white", linewidth = 1) +
  geom_text(aes(label = Count), size = 8, fontface = "bold") +
  scale_fill_gradient(low = "#f7fbff", high = "#2171b5") +
  labs(
    title = "Headache Status: Before vs. After Treatment",
    subtitle = "Discordant cells (off-diagonal) drive McNemar's test",
    x = "After Treatment",
    y = "Before Treatment"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    panel.grid = element_blank(),
    legend.position = "none"
  )

For a quick base R visualization:

# Mosaic plot
mosaicplot(headache_data, 
           main = "Headache Status Before/After Treatment",
           color = c("#d73027", "#4575b4"),
           las = 1)

Common Pitfalls and Alternatives

Using McNemar’s test on unpaired data. This is the most common mistake. If your groups are independent (different subjects in each condition), use a chi-square test or Fisher’s exact test instead.

Ignoring the discordant pair count. A large sample doesn’t guarantee reliable results. What matters is the number of discordant pairs. With only 10 discordant pairs, your test has low power regardless of total sample size.

Misinterpreting the null hypothesis. McNemar’s test examines whether the marginal proportions differ, not whether individual subjects changed. The null hypothesis states that the probability of changing from Yes to No equals the probability of changing from No to Yes.

Forgetting about effect size. Statistical significance doesn’t imply practical importance. Report the odds ratio for discordant pairs: OR = b/c. In our headache example, OR = 35/10 = 3.5, meaning patients were 3.5 times more likely to improve than worsen.
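The odds ratio can be paired with an approximate confidence interval. A sketch using the standard Wald approximation for the log odds ratio (the standard error sqrt(1/b + 1/c) is a textbook approximation, not something mcnemar.test() reports):

```r
# Discordant-pair odds ratio with an approximate 95% Wald CI
b <- 35  # improved
c <- 10  # worsened

or <- b / c                                # 3.5
se <- sqrt(1 / b + 1 / c)                  # SE of log(OR) for discordant pairs
ci <- exp(log(or) + c(-1, 1) * 1.96 * se)

round(c(OR = or, lower = ci[1], upper = ci[2]), 2)
# OR = 3.5 with a 95% CI of roughly (1.73, 7.07)
```

Since the interval excludes 1, it agrees with the significant p-value.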

Alternatives to consider:

  • Cochran’s Q test: Extends McNemar’s to three or more related groups (multiple time points)
  • Stuart-Maxwell test: Handles paired data with more than two categories
  • Generalized Estimating Equations (GEE): For more complex repeated measures designs with covariates

# Cochran's Q for multiple time points
# install.packages("RVAideMemoire")
library(RVAideMemoire)

# Example: 6 patients with a binary outcome at 3 time points (long format)
multitime <- data.frame(
  patient  = factor(rep(1:6, each = 3)),
  time     = factor(rep(c("T1", "T2", "T3"), times = 6)),
  response = c(1,1,0, 1,0,0, 0,1,1, 1,1,1, 0,0,0, 1,0,1)
)

# cochran.qtest() expects a formula: response ~ within-factor | subject
cochran.qtest(response ~ time | patient, data = multitime)

McNemar’s test remains the right choice for simple paired binary comparisons. Use it correctly, report it completely, and your analysis will stand up to scrutiny.
