How to Perform Fisher's Exact Test in R
Key Insights
- Fisher’s Exact Test calculates exact probabilities rather than relying on approximations, making it the preferred choice when sample sizes are small or expected cell counts fall below 5.
- The fisher.test() function in R handles everything from basic 2x2 tables to larger contingency tables, with options for one-sided tests and Monte Carlo simulation for computational efficiency.
- Always examine the odds ratio and confidence interval alongside the p-value—statistical significance without practical effect size tells an incomplete story.
Introduction to Fisher’s Exact Test
Fisher’s Exact Test is a statistical significance test used to determine whether there’s a non-random association between two categorical variables. Unlike the chi-square test, which relies on asymptotic approximations that break down with small samples, Fisher’s test computes exact probabilities using the hypergeometric distribution.
Ronald Fisher developed this test in the 1930s, famously illustrating it with a story about a lady claiming she could tell whether milk or tea was added first to a cup. The test answers a simple question: given the marginal totals of a contingency table, what’s the probability of observing an arrangement as extreme as (or more extreme than) the one we actually observed, assuming no association exists?
The chi-square test assumes your expected cell frequencies are large enough for the chi-square distribution to approximate reality. When expected counts drop below 5, that assumption fails. Fisher’s test makes no such assumption—it works with the exact distribution of possible outcomes.
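To make "exact distribution" concrete: with the margins of a 2x2 table fixed, the count in one cell follows a hypergeometric distribution, and you can reproduce the exact probabilities yourself with dhyper(). This sketch uses the 8/2 vs 3/7 drug-trial table that appears later in this article:

```r
# Margins fixed: 11 improved, 9 not improved, 10 patients per group.
# X = number of improved patients in the drug group is hypergeometric.
p_observed <- dhyper(8, m = 11, n = 9, k = 10)  # P(X = 8), ~0.0321

# The two-sided p-value sums every table whose probability is no larger
# than the observed one (with a small relative tolerance, as
# fisher.test() itself applies):
all_probs <- dhyper(0:10, m = 11, n = 9, k = 10)
p_two_sided <- sum(all_probs[all_probs <= p_observed * (1 + 1e-7)])
round(p_two_sided, 5)  # ~0.06978, matching fisher.test()
```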
When to Use Fisher’s Exact Test
Use Fisher’s Exact Test when:
Small sample sizes dominate your data. If your total sample is under 1,000 and any expected cell count falls below 5, Fisher’s test is the safer choice. Some statisticians argue for using it whenever any expected count is below 10.
You’re analyzing 2x2 contingency tables. The test is computationally straightforward for 2x2 tables. Larger tables are possible but computationally expensive.
You need exact p-values. In regulatory contexts or when publishing in journals that require exact tests, Fisher’s provides mathematically precise probabilities.
Real-world applications include:
- Clinical trials with rare events: Testing whether a new treatment affects the occurrence of rare side effects when you have 50 patients total.
- A/B testing early results: Checking for significant differences when only 100 users have been exposed to each variant.
- Quality control: Determining if defect rates differ between two production lines with limited batch data.
- Epidemiological studies: Analyzing case-control studies with small numbers of exposed individuals.
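The A/B-testing scenario, for instance, takes only a few lines. The conversion counts below are made up for illustration:

```r
# Hypothetical A/B test: 100 users per variant (illustrative counts)
ab_table <- matrix(c(12, 88,   # Variant A: 12 converted, 88 did not
                      5, 95),  # Variant B: 5 converted, 95 did not
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Variant = c("A", "B"),
                                   Converted = c("Yes", "No")))
fisher.test(ab_table)$p.value
```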
Understanding the Data Structure
Fisher’s Exact Test operates on contingency tables—cross-tabulations showing the frequency distribution of variables. For a 2x2 table, you have two binary variables, each with two levels.
Here’s how to create a contingency table manually:
# Manual matrix creation
# Rows: Treatment (Drug vs Placebo)
# Columns: Outcome (Improved vs Not Improved)
treatment_data <- matrix(
  c(8, 2,   # Drug: 8 improved, 2 not improved
    3, 7),  # Placebo: 3 improved, 7 not improved
  nrow = 2,
  byrow = TRUE,
  dimnames = list(
    Treatment = c("Drug", "Placebo"),
    Outcome = c("Improved", "Not Improved")
  )
)
print(treatment_data)
Output:
         Outcome
Treatment Improved Not Improved
  Drug           8            2
  Placebo        3            7
When working with raw data, use table() to build the contingency table:
# Raw data approach
patients <- data.frame(
  treatment = c(rep("Drug", 10), rep("Placebo", 10)),
  outcome = c(
    # Drug group outcomes
    rep("Improved", 8), rep("Not Improved", 2),
    # Placebo group outcomes
    rep("Improved", 3), rep("Not Improved", 7)
  )
)
# Create contingency table from raw data
contingency_table <- table(patients$treatment, patients$outcome)
print(contingency_table)
The order of variables matters for interpretation. By convention, the exposure or treatment goes in rows, and the outcome goes in columns.
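One way to see why order matters: reversing the row order leaves the p-value unchanged but inverts the direction of the odds ratio. A quick check (rebuilding the same treatment_data matrix so the snippet stands alone):

```r
# Same table as above; swapping rows flips the odds ratio's direction
treatment_data <- matrix(c(8, 2, 3, 7), nrow = 2, byrow = TRUE)

original <- fisher.test(treatment_data)
reversed <- fisher.test(treatment_data[c(2, 1), ])  # Placebo row first

original$estimate  # ~8.37 (odds of improvement, Drug vs Placebo)
reversed$estimate  # ~0.12, roughly the reciprocal
all.equal(original$p.value, reversed$p.value)  # TRUE
```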
Running Fisher’s Exact Test in R
The fisher.test() function is your primary tool. At its simplest:
# Basic Fisher's Exact Test
result <- fisher.test(treatment_data)
print(result)
Output:
Fisher's Exact Test for Count Data
data: treatment_data
p-value = 0.06978
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.8641741 79.8295009
sample estimates:
odds ratio
8.369067
The function accepts several important parameters:
# Key parameters explained
fisher.test(
  x,                          # Matrix or table
  alternative = "two.sided",  # "two.sided", "less", or "greater"
  conf.level = 0.95,          # Confidence level for odds ratio CI
  simulate.p.value = FALSE,   # Use Monte Carlo simulation
  B = 2000                    # Number of simulations if simulate.p.value = TRUE
)
For directional hypotheses, use one-sided tests:
# Two-sided test (default): Is there any association?
two_sided <- fisher.test(treatment_data, alternative = "two.sided")
cat("Two-sided p-value:", two_sided$p.value, "\n")
# One-sided test: Is drug BETTER than placebo?
# "greater" tests if odds ratio > 1
greater_test <- fisher.test(treatment_data, alternative = "greater")
cat("One-sided (greater) p-value:", greater_test$p.value, "\n")
# One-sided test: Is drug WORSE than placebo?
# "less" tests if odds ratio < 1
less_test <- fisher.test(treatment_data, alternative = "less")
cat("One-sided (less) p-value:", less_test$p.value, "\n")
Output:
Two-sided p-value: 0.06978
One-sided (greater) p-value: 0.03489
One-sided (less) p-value: 0.9973
Notice that here the one-sided p-value is exactly half the two-sided value; that holds in this example because the table's null distribution happens to be symmetric, but in general the relationship is only approximate. Choose your alternative hypothesis before looking at the data—post-hoc selection invalidates your inference.
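Under the hood, the one-sided p-values are plain hypergeometric tail probabilities, which you can verify with phyper():

```r
# With margins fixed (11 improved, 9 not improved, 10 per group), the
# number of improved patients in the drug group X is hypergeometric,
# and the observed value is X = 8.
p_greater <- phyper(7, m = 11, n = 9, k = 10, lower.tail = FALSE)  # P(X >= 8)
round(p_greater, 5)  # 0.03489, matching alternative = "greater"
```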
Interpreting the Results
The test output contains several components worth understanding:
p-value: The probability of observing results as extreme as yours (or more extreme) if the null hypothesis of no association is true. With α = 0.05, a p-value below 0.05 suggests statistically significant association.
Odds ratio: The ratio of odds of the outcome in one group versus the other. An odds ratio of 1 means no association. Values above 1 indicate the outcome is more likely in the first group; values below 1 indicate it’s less likely.
Confidence interval: A range of values for the true odds ratio that are consistent with the observed data at the chosen confidence level. If this interval excludes 1, the association is statistically significant at that level.
Extract specific values programmatically:
result <- fisher.test(treatment_data)
# Extract individual components
p_value <- result$p.value
odds_ratio <- result$estimate
ci_lower <- result$conf.int[1]
ci_upper <- result$conf.int[2]
cat("P-value:", round(p_value, 4), "\n")
cat("Odds Ratio:", round(odds_ratio, 2), "\n")
cat("95% CI: [", round(ci_lower, 2), ",", round(ci_upper, 2), "]\n")
# Interpretation logic
if (p_value < 0.05) {
  cat("\nResult: Statistically significant association detected.\n")
} else {
  cat("\nResult: No statistically significant association.\n")
}
# Effect size interpretation
if (odds_ratio > 1) {
  cat("Direction: First group has higher odds of outcome.\n")
} else if (odds_ratio < 1) {
  cat("Direction: First group has lower odds of outcome.\n")
}
In our example, the odds ratio of 8.37 suggests patients receiving the drug have about 8 times the odds of improvement compared to placebo. However, the wide confidence interval (0.86 to 79.83) spanning 1 and the p-value of 0.07 indicate this finding isn’t statistically significant at α = 0.05. The sample size is simply too small to draw firm conclusions, despite a seemingly large effect.
Extending to Larger Tables
Fisher’s test extends beyond 2x2 tables, but computational complexity grows rapidly. For larger tables, R uses a network algorithm that can become slow or memory-intensive.
# 3x3 contingency table example
# Testing association between education level and job satisfaction
job_data <- matrix(
  c(12,  8,  5,   # High school: Low/Medium/High satisfaction
    15, 20, 10,   # Bachelor's
     8, 12, 25),  # Graduate
  nrow = 3,
  byrow = TRUE,
  dimnames = list(
    Education = c("High School", "Bachelor's", "Graduate"),
    Satisfaction = c("Low", "Medium", "High")
  )
)
print(job_data)
# Standard Fisher's test (may be slow for larger tables)
result_3x3 <- fisher.test(job_data)
print(result_3x3)
For tables where exact computation is prohibitive, use Monte Carlo simulation:
# Monte Carlo simulation for computational efficiency
result_simulated <- fisher.test(
  job_data,
  simulate.p.value = TRUE,
  B = 10000  # Number of Monte Carlo replicates
)
cat("Simulated p-value:", result_simulated$p.value, "\n")
Note that for tables larger than 2x2, Fisher’s test doesn’t provide an odds ratio—only a p-value. The test becomes a general test of independence rather than a measure of association strength.
Summary and Best Practices
Fisher’s Exact Test remains essential for analyzing categorical data when sample sizes preclude chi-square approximations. Here’s what to remember:
Choose Fisher’s over chi-square when: Any expected cell count falls below 5, your total sample is small, or you need exact (not approximate) p-values.
Structure your data correctly: Ensure your contingency table has the exposure/treatment in rows and outcome in columns for meaningful odds ratio interpretation.
Specify your hypothesis beforehand: Decide on one-sided versus two-sided testing before analyzing data. One-sided tests are more powerful but only appropriate when you have a directional hypothesis.
Report effect sizes: A p-value alone is insufficient. Always report the odds ratio and confidence interval. A non-significant p-value with a wide confidence interval suggests insufficient sample size, not absence of effect.
Watch for common pitfalls:
- Don’t use Fisher’s test for paired or matched data—use McNemar’s test instead.
- Don’t interpret a non-significant result as evidence of no association—it may simply reflect low statistical power.
- Don’t ignore the confidence interval width when interpreting results.
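For the paired-data pitfall, mcnemar.test() is the drop-in replacement. The before/after counts below are hypothetical:

```r
# Hypothetical paired data: the same 50 subjects measured before and after
paired <- matrix(c(20,  5,   # Before = Yes: 20 stayed Yes, 5 switched to No
                   15, 10),  # Before = No: 15 switched to Yes, 10 stayed No
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Before = c("Yes", "No"),
                                 After  = c("Yes", "No")))
mcnemar.test(paired)  # tests only the discordant pairs (5 vs 15)
```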
Consider alternatives when appropriate: For larger samples, chi-square tests are computationally efficient and produce similar results. For ordinal data, consider Cochran-Armitage trend tests. For stratified data, use Cochran-Mantel-Haenszel tests.
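To see how quickly the two tests converge, scale the earlier drug example up tenfold (purely illustrative counts):

```r
# Same proportions as the 8/2 vs 3/7 example, ten times the sample size
big_table <- matrix(c(80, 20,
                      30, 70), nrow = 2, byrow = TRUE)
fisher_p <- fisher.test(big_table)$p.value
chisq_p  <- chisq.test(big_table)$p.value
c(fisher = fisher_p, chisq = chisq_p)  # both far below 0.001
```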
Fisher’s Exact Test does one thing exceptionally well: it tells you the exact probability of your observed data under the null hypothesis. Use it when precision matters and sample sizes are limited.