How to Perform a One-Proportion Z-Test in R
Key Insights
- The one-proportion z-test determines whether a sample proportion significantly differs from a hypothesized population proportion—essential for quality control, A/B testing, and survey validation.
- R’s built-in prop.test() function handles the heavy lifting, but understanding the manual calculation helps you interpret results correctly and troubleshoot edge cases.
- Always verify your assumptions (random sampling, binary outcomes, sufficient sample size) before trusting your p-values—violated assumptions produce misleading conclusions.
Introduction
The one-proportion z-test answers a simple but powerful question: does my observed proportion differ significantly from what I expected? You’re comparing a single sample proportion against a known or hypothesized population proportion.
This test appears constantly in applied statistics. Quality engineers use it to determine if defect rates exceed acceptable thresholds. Marketing analysts test whether conversion rates match industry benchmarks. Researchers validate survey responses against known population demographics. If you’re working with binary outcomes and a reference proportion, this is your tool.
R makes this straightforward with prop.test(), but understanding the mechanics behind the function separates competent analysts from those who blindly trust output. Let’s build that understanding.
Assumptions and Requirements
Before running any hypothesis test, verify your assumptions. The one-proportion z-test requires four conditions:
Random sampling: Your observations must be randomly selected from the population of interest. Convenience samples or self-selected respondents violate this assumption and bias your results.
Binary outcome: Each observation falls into exactly one of two categories—success or failure, defective or non-defective, clicked or didn’t click. No middle ground.
Sufficient sample size: The normal approximation underlying the z-test requires both np ≥ 10 and n(1-p) ≥ 10, where n is your sample size and p is the hypothesized proportion. This ensures the sampling distribution is approximately normal.
Independence: Each observation must be independent of others. In practice, this typically means sampling without replacement from a population at least 10 times larger than your sample.
Violating these assumptions doesn’t necessarily invalidate your analysis, but it does require caution. Small samples may need exact binomial tests instead. Non-random sampling requires careful qualification of your conclusions. When in doubt, be conservative in your interpretations.
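When the sample-size condition fails, the standard fallback mentioned above is an exact binomial test, which base R provides as binom.test(). Here is a minimal sketch with illustrative numbers (3 defectives in 20 items, a case where np = 1 falls far below the threshold):

```r
# Exact binomial test: safe when the np >= 10 rule fails (illustrative numbers)
# np = 20 * 0.05 = 1, so the normal approximation is unreliable here
exact_result <- binom.test(x = 3, n = 20, p = 0.05,
                           alternative = "two.sided")
exact_result$p.value  # exact p-value; no normal approximation involved
```

Because it computes binomial probabilities directly, binom.test() is valid at any sample size; its main limitation is that it only handles the one-sample case, whereas prop.test() also covers multi-group comparisons.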
The Mathematical Foundation
The z-test statistic measures how many standard errors your sample proportion falls from the hypothesized proportion:
z = (p̂ - p₀) / √(p₀(1-p₀)/n)
Where:
- p̂ (p-hat) is your observed sample proportion
- p₀ is the hypothesized population proportion
- n is your sample size
- The denominator is the standard error under the null hypothesis
The null hypothesis states that the true population proportion equals the hypothesized value (H₀: p = p₀). The alternative hypothesis depends on your research question—it can be two-tailed (p ≠ p₀) or one-tailed (p > p₀ or p < p₀).
Here’s how to calculate this manually in R:
# Manual one-proportion z-test calculation
# Scenario: 58 successes out of 100 trials, testing against p = 0.50
n <- 100 # sample size
x <- 58 # number of successes
p_hat <- x / n # sample proportion (0.58)
p_0 <- 0.50 # hypothesized proportion
# Calculate z-statistic
standard_error <- sqrt(p_0 * (1 - p_0) / n)
z_stat <- (p_hat - p_0) / standard_error
# Calculate two-tailed p-value
p_value <- 2 * (1 - pnorm(abs(z_stat)))
# Display results
cat("Sample proportion:", p_hat, "\n")
cat("Z-statistic:", round(z_stat, 4), "\n")
cat("P-value (two-tailed):", round(p_value, 4), "\n")
Output:
Sample proportion: 0.58
Z-statistic: 1.6
P-value (two-tailed): 0.1096
This manual approach clarifies exactly what’s happening. The sample proportion of 0.58 is 1.6 standard errors above the hypothesized 0.50, yielding a p-value of approximately 0.11—not significant at the conventional α = 0.05 level.
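You can verify the manual arithmetic against R’s built-in function: with the continuity correction switched off, prop.test() reports a chi-squared statistic that is exactly the square of the z-statistic computed above.

```r
# Cross-check: prop.test() without Yates' correction reproduces z^2
uncorrected <- prop.test(x = 58, n = 100, p = 0.50, correct = FALSE)
unname(uncorrected$statistic)  # 2.56, which is 1.6^2
uncorrected$p.value            # ~0.1096, matching the manual two-tailed p-value
```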
Using prop.test() in Base R
While manual calculation builds understanding, prop.test() is your production tool. It’s built into base R, handles edge cases gracefully, and provides confidence intervals automatically.
# Basic prop.test() syntax
# Testing if defect rate differs from 5% acceptable threshold
defective <- 18 # defective items found
total_inspected <- 250 # total items inspected
result <- prop.test(
x = defective,
n = total_inspected,
p = 0.05, # hypothesized proportion (5%)
alternative = "two.sided",
correct = TRUE # Yates' continuity correction (default)
)
print(result)
Output:
1-sample proportions test with continuity correction
data: defective out of total_inspected, null probability 0.05
X-squared = 2.1053, df = 1, p-value = 0.1468
alternative hypothesis: true p is not equal to 0.05
95 percent confidence interval:
0.04461498 0.11172217
sample estimates:
p
0.072
Key parameters explained:
- x: Number of successes (or events of interest)
- n: Total number of trials
- p: The null hypothesis proportion you’re testing against
- alternative: Direction of test ("two.sided", "greater", or "less")
- correct: Whether to apply Yates’ continuity correction (recommended for small samples)
Note that prop.test() reports a chi-squared statistic rather than a z-statistic. They’re mathematically equivalent: the chi-squared value equals z², with the same continuity correction applied to both. In this case, √2.1053 ≈ 1.45, which is the continuity-corrected z-statistic.
The p-value of 0.147 indicates that, despite an observed defect rate of 7.2% against the 5% threshold, the evidence isn’t strong enough to reject the null at α = 0.05. Consistent with this, the 95% confidence interval (4.5% to 11.2%) includes 5%. With only 250 items inspected, a gap this size could plausibly be sampling noise; more data is needed before declaring a quality problem.
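Yates’ correction deliberately makes the test conservative by shrinking the deviation from the null by half an observation. A quick side-by-side on the same data shows its effect (the difference fades as n grows):

```r
# Compare p-values with and without Yates' continuity correction
corrected_p   <- prop.test(18, 250, p = 0.05, correct = TRUE)$p.value
uncorrected_p <- prop.test(18, 250, p = 0.05, correct = FALSE)$p.value
cat("With correction:   ", round(corrected_p, 4), "\n")
cat("Without correction:", round(uncorrected_p, 4), "\n")
# The corrected p-value is always at least as large (more conservative)
```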
One-Tailed vs. Two-Tailed Tests
Your research question dictates which test to use. Two-tailed tests ask “is there a difference?” One-tailed tests ask “is it specifically higher?” or “is it specifically lower?”
Use one-tailed tests when:
- You have a directional hypothesis before seeing the data
- Only one direction of effect matters practically
- You want more statistical power in that specific direction
Use two-tailed tests when:
- You’re exploring whether any difference exists
- Deviations in either direction would be meaningful
- You’re unsure about the direction of effect
# Comparing all three alternative hypothesis options
# Scenario: Testing if customer satisfaction (72/100) exceeds 65% benchmark
successes <- 72
trials <- 100
benchmark <- 0.65
# Two-tailed: Is satisfaction different from 65%?
two_tailed <- prop.test(successes, trials, p = benchmark,
alternative = "two.sided")
# Greater: Is satisfaction higher than 65%?
greater <- prop.test(successes, trials, p = benchmark,
alternative = "greater")
# Less: Is satisfaction lower than 65%?
less <- prop.test(successes, trials, p = benchmark,
alternative = "less")
# Compare p-values
cat("Two-tailed p-value:", round(two_tailed$p.value, 4), "\n")
cat("Greater p-value:", round(greater$p.value, 4), "\n")
cat("Less p-value:", round(less$p.value, 4), "\n")
Output:
Two-tailed p-value: 0.173
Greater p-value: 0.0865
Less p-value: 0.9135
Notice the relationship: the one-tailed p-value in the direction of the observed effect (greater) is exactly half the two-tailed p-value, and the two one-tailed p-values sum to 1. The opposite direction yields a large p-value because the data strongly contradict that hypothesis.
Confidence Intervals and Effect Size
Statistical significance tells you whether an effect exists. Confidence intervals and effect sizes tell you whether it matters.
Extract the confidence interval from prop.test() output:
# Extracting and interpreting confidence intervals
result <- prop.test(x = 156, n = 400, p = 0.35)
# Access specific components
observed_prop <- result$estimate
conf_interval <- result$conf.int
conf_level <- attr(result$conf.int, "conf.level")
cat("Observed proportion:", observed_prop, "\n")
cat(conf_level * 100, "% CI: [",
round(conf_interval[1], 4), ", ",
round(conf_interval[2], 4), "]\n", sep = "")
# Cohen's h for effect size
# Measures the difference between two proportions on an arcsine scale
p_observed <- 156 / 400 # 0.39
p_null <- 0.35
cohens_h <- 2 * asin(sqrt(p_observed)) - 2 * asin(sqrt(p_null))
cat("Cohen's h:", round(cohens_h, 4), "\n")
cat("Effect size interpretation: ",
ifelse(abs(cohens_h) < 0.2, "small",
ifelse(abs(cohens_h) < 0.5, "small-to-medium",
ifelse(abs(cohens_h) < 0.8, "medium", "large"))), "\n")
Output:
Observed proportion: 0.39
95% CI: [0.3421, 0.4403]
Cohen's h: 0.0829
Effect size interpretation: small
The confidence interval (34.2% to 44.0%) contains the null value of 35%, consistent with the non-significant p-value. Cohen’s h of 0.08 indicates a trivially small effect—even if this were statistically significant, the practical difference is negligible.
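Effect size also drives sample-size planning. A sketch of the standard normal-approximation formula n ≈ (z₁₋α/₂ + z₁₋β)² / h² shows why trivially small effects like this one are expensive to detect (the pwr package’s pwr.p.test() performs the same calculation if you prefer a packaged version):

```r
# Sketch: sample size needed to detect a given Cohen's h with given power
# Uses the normal-approximation formula n = (z_{1-alpha/2} + z_{1-beta})^2 / h^2
sample_size_for_h <- function(h, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha / 2)  # 1.96 for a two-tailed 5% test
  z_beta  <- qnorm(power)          # 0.84 for 80% power
  ceiling((z_alpha + z_beta)^2 / h^2)
}
sample_size_for_h(0.08)  # ~1227 observations for a trivial effect
sample_size_for_h(0.50)  # ~32 observations for a medium effect
```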
Complete Worked Example
Let’s walk through a realistic analysis from start to finish. An e-commerce company claims their website converts 3.5% of visitors to customers. After a site redesign, you collect data on 2,000 visitors and observe 84 conversions. Did the redesign change the conversion rate?
# =============================================================
# One-Proportion Z-Test: Website Conversion Rate Analysis
# =============================================================
# --- Step 1: Define the problem and hypotheses ---
# H0: p = 0.035 (conversion rate equals historical 3.5%)
# Ha: p ≠ 0.035 (conversion rate has changed)
# Significance level: α = 0.05
conversions <- 84
visitors <- 2000
historical_rate <- 0.035
# --- Step 2: Check assumptions ---
# Random sampling: Visitors during test period (assumed representative)
# Binary outcome: Converted or didn't convert ✓
# Sample size check:
np <- visitors * historical_rate
nq <- visitors * (1 - historical_rate)
cat("Assumption check:\n")
cat(" np =", np, "(need ≥ 10) ✓\n")
cat(" n(1-p) =", nq, "(need ≥ 10) ✓\n\n")
# --- Step 3: Conduct the test ---
test_result <- prop.test(
x = conversions,
n = visitors,
p = historical_rate,
alternative = "two.sided",
correct = TRUE
)
print(test_result)
# --- Step 4: Extract and interpret results ---
cat("\n--- Results Summary ---\n")
cat("Observed conversion rate:",
round(test_result$estimate * 100, 2), "%\n")
cat("Historical benchmark:", historical_rate * 100, "%\n")
cat("Difference:",
round((test_result$estimate - historical_rate) * 100, 2),
"percentage points\n")
cat("95% CI:",
round(test_result$conf.int[1] * 100, 2), "% to",
round(test_result$conf.int[2] * 100, 2), "%\n")
cat("P-value:", round(test_result$p.value, 4), "\n")
# --- Step 5: Calculate effect size ---
cohens_h <- 2 * asin(sqrt(conversions/visitors)) -
2 * asin(sqrt(historical_rate))
cat("Cohen's h:", round(cohens_h, 3), "\n")
# --- Step 6: State conclusion ---
alpha <- 0.05
if (test_result$p.value < alpha) {
cat("\nConclusion: Reject H0. The conversion rate has significantly changed.\n")
} else {
cat("\nConclusion: Fail to reject H0. No significant change detected.\n")
}
Output:
Assumption check:
np = 70 (need ≥ 10) ✓
n(1-p) = 1930 (need ≥ 10) ✓
1-sample proportions test with continuity correction
data: conversions out of visitors, null probability 0.035
X-squared = 2.698, df = 1, p-value = 0.1005
alternative hypothesis: true p is not equal to 0.035
95 percent confidence interval:
0.03384498 0.05175498
sample estimates:
p
0.042
--- Results Summary ---
Observed conversion rate: 4.2 %
Historical benchmark: 3.5 %
Difference: 0.7 percentage points
95% CI: 3.38 % to 5.18 %
P-value: 0.1005 
Cohen's h: 0.036 
Conclusion: Fail to reject H0. No significant change detected.
The observed conversion rate of 4.2% is higher than the historical 3.5%, but this difference isn’t statistically significant (p = 0.1005). The 95% confidence interval includes the historical rate, and Cohen’s h of 0.036 indicates a trivial effect size. The redesign hasn’t demonstrably changed conversion rates—at least not with this sample size.
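A natural follow-up question is whether the study ever had a realistic chance of detecting a lift this small. A quick Monte Carlo sketch, assuming the observed 4.2% were the true rate, estimates the test’s power at n = 2,000:

```r
# Monte Carlo power sketch: if the true conversion rate were 4.2%,
# how often would this test (n = 2000, H0: p = 0.035) reject at alpha = 0.05?
set.seed(42)  # arbitrary seed, for reproducibility only
rejections <- replicate(2000, {
  simulated_x <- rbinom(1, size = 2000, prob = 0.042)
  prop.test(simulated_x, 2000, p = 0.035)$p.value < 0.05
})
mean(rejections)  # estimated power -- well below the conventional 0.80 target
```

Low power means a non-significant result here is weak evidence of "no change"; a longer test window or a larger traffic allocation would be needed before concluding the redesign had no effect.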
The one-proportion z-test is a foundational tool that you’ll use repeatedly. Master prop.test(), understand its assumptions, and always pair statistical significance with practical significance through confidence intervals and effect sizes.