Uniform Distribution in R: Complete Guide

Key Insights

R provides four core functions for uniform distribution (runif, dunif, punif, qunif) that handle random generation, density, cumulative probability, and quantiles respectively
The uniform distribution is essential for Monte Carlo simulations, random sampling, and generating test data, but should only be used when all outcomes have truly equal probability
Setting a seed with set.seed() before generating random uniform numbers ensures reproducibility across runs—critical for debugging and sharing analysis code

Introduction to Uniform Distribution

The uniform distribution is the simplest probability distribution: every value within a specified range is equally likely. In the continuous case, any single value has probability zero, so "equally likely" means that every interval of equal length within the range has the same probability. In the discrete case, each distinct outcome has identical probability.
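As a quick check of the continuous case, the sketch below uses punif() (covered later in this guide) to confirm that two intervals of equal length carry the same probability. The range [0, 10] and the interval endpoints are arbitrary choices for illustration.

```r
# Uniform on [0, 10]: any interval of length 1 inside the range has probability 1/10
p_low  <- punif(3, min = 0, max = 10) - punif(2, min = 0, max = 10)
p_high <- punif(8, min = 0, max = 10) - punif(7, min = 0, max = 10)

all.equal(p_low, p_high)   # TRUE
all.equal(p_low, 1 / 10)   # TRUE
```

all.equal() is used instead of == because the two differences can disagree in the last floating-point digit.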

Understanding uniform distributions is fundamental for data scientists and statisticians. You’ll use them for random number generation in simulations, creating synthetic test datasets, implementing Monte Carlo methods, and as building blocks for more complex probability distributions through inverse transform sampling.
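To make the inverse transform idea concrete, here is a minimal sketch that turns uniform draws into exponential draws by applying the exponential inverse CDF. The rate value, the sample size, and the variable names are assumptions chosen for illustration.

```r
set.seed(1)
rate <- 2                      # illustrative rate parameter
u <- runif(10000)              # uniform draws on (0, 1)

# Inverse CDF of the exponential distribution: -log(1 - u) / rate
exp_manual <- -log(1 - u) / rate

# R's built-in quantile function applies the same transform
exp_builtin <- qexp(u, rate = rate)

all.equal(exp_manual, exp_builtin)  # TRUE
mean(exp_manual)                    # close to 1 / rate = 0.5
```

The same pattern works for any distribution whose quantile function is available: draw u with runif() and pass it to the q-function.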

Let’s visualize the difference between uniform and normal distributions:

# Compare uniform vs normal distribution
set.seed(42)
uniform_data <- runif(10000, min = 0, max = 10)
normal_data <- rnorm(10000, mean = 5, sd = 2)

par(mfrow = c(1, 2))
hist(uniform_data, breaks = 50, main = "Uniform Distribution",
     xlab = "Value", col = "lightblue", probability = TRUE)
hist(normal_data, breaks = 50, main = "Normal Distribution",
     xlab = "Value", col = "lightcoral", probability = TRUE)

The uniform distribution shows a flat, rectangular shape where all values between 0 and 10 appear with roughly equal frequency. The normal distribution clusters around the mean with a bell curve.

R Functions for Uniform Distribution

R follows a consistent naming convention for distribution functions. The uniform distribution functions all end in unif, with a prefix indicating their purpose:

runif(): random generation - generates random values from a uniform distribution
dunif(): density - returns the probability density at specific points
punif(): probability - calculates cumulative distribution (P(X ≤ x))
qunif(): quantile - finds values corresponding to given probabilities (inverse CDF)

All four functions share common parameters: min (default 0) and max (default 1) define the distribution range.
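The sketch below illustrates the defaults and shows that a Uniform(min, max) draw is a shifted and scaled Uniform(0, 1) draw. The seed and the range 10 to 20 are arbitrary, and the same-seed comparison assumes R's default random number generator.

```r
set.seed(7)
a <- runif(3)                       # relies on the defaults min = 0, max = 1

set.seed(7)
b <- runif(3, min = 0, max = 1)     # same call with the defaults spelled out

identical(a, b)                     # TRUE

# Rescaling a standard uniform draw: min + (max - min) * U
set.seed(7)
scaled <- 10 + (20 - 10) * runif(3)

set.seed(7)
direct <- runif(3, min = 10, max = 20)

all.equal(scaled, direct)           # TRUE
```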

Here’s how these functions relate to each other:

# Generate random values (no seed is set, so your output will differ)
random_vals <- runif(5, min = 0, max = 10)
print(random_vals)
# Example output:
# [1] 9.148060 2.930507 2.861395 8.304476 6.417455

# Density at x = 5 (height of PDF)
dunif(5, min = 0, max = 10)
# [1] 0.1

# Cumulative probability P(X <= 5)
punif(5, min = 0, max = 10)
# [1] 0.5

# Value at 50th percentile (inverse of punif)
qunif(0.5, min = 0, max = 10)
# [1] 5

Notice that punif(5, 0, 10) returns 0.5, and qunif(0.5, 0, 10) returns 5—they’re inverses of each other.
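The inverse relationship holds for whole vectors, not just a single point. The following sketch runs a round trip through punif() and qunif(); the test values are illustrative.

```r
x <- c(0, 2.5, 5, 7.5, 10)
p <- punif(x, min = 0, max = 10)        # 0.00 0.25 0.50 0.75 1.00
x_back <- qunif(p, min = 0, max = 10)

all.equal(x, x_back)                    # TRUE

# Outside the range the CDF is flat at 1, so the round trip returns the boundary
qunif(punif(12, min = 0, max = 10), min = 0, max = 10)   # 10
```

The second call shows that the round trip only recovers x when x lies inside [min, max].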

Generating Random Uniform Numbers

The runif() function is your workhorse for generating random uniform numbers. The first parameter n specifies how many values to generate.

# Generate 10 random numbers between 0 and 1 (default)
runif(10)

# Generate 5 random numbers between 10 and 20
runif(5, min = 10, max = 20)

# Generate random numbers for custom ranges
prices <- runif(1000, min = 9.99, max = 99.99)
temperatures <- runif(365, min = -10, max = 35)

Reproducibility is critical in data science. Use set.seed() to ensure your random numbers are identical across runs:

# Without seed - different results each time
runif(3)
# Output changes from session to session
runif(3)
# A second call continues the random stream, so the values differ again

# With seed - reproducible results
set.seed(123)
runif(3)
# [1] 0.2875775 0.7883051 0.4089769

set.seed(123)
runif(3)
# [1] 0.2875775 0.7883051 0.4089769  # Identical!

To generate uniform random integers, use floor() or sample():

# Simulate 10 dice rolls (1-6)
dice_rolls <- floor(runif(10, min = 1, max = 7))
print(dice_rolls)

# Better approach for integers: use sample()
dice_rolls <- sample(1:6, size = 10, replace = TRUE)

While runif() with floor() works, sample() is more explicit and efficient for discrete uniform distributions.
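One way to sanity-check a discrete uniform generator is to tabulate a large number of draws and compare the observed shares with the expected 1/6. The sample size and seed below are arbitrary; the chi-squared test is one reasonable check, not the only one.

```r
set.seed(2024)
rolls <- sample(1:6, size = 60000, replace = TRUE)

# Observed share of each face; each should be near 1/6 (about 0.167)
round(prop.table(table(rolls)), 3)

# Chi-squared goodness-of-fit test against equal probabilities (the default)
chisq.test(table(rolls))$p.value
```

A large p-value here is consistent with a fair die; it does not prove fairness.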

Probability Density and Cumulative Distribution

The dunif() function returns the probability density. For a uniform distribution on [a, b], the density is constant at 1/(b-a):

# Density for uniform distribution on [0, 10]
dunif(5, min = 0, max = 10)
# [1] 0.1  # This is 1/(10-0)

# Density is constant across the range
dunif(c(0, 2.5, 5, 7.5, 10), min = 0, max = 10)
# [1] 0.1 0.1 0.1 0.1 0.1

# Outside the range, density is 0
dunif(-1, min = 0, max = 10)
# [1] 0
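A density is not a probability: it only has to integrate to 1 over the range. The sketch below checks this numerically with integrate() and shows that a narrow range produces density values above 1. The ranges are illustrative.

```r
# Area under the Uniform(0, 10) density across its support
area <- integrate(dunif, lower = 0, upper = 10, min = 0, max = 10)
area$value                                   # 1 (up to numerical error)

# On a range narrower than 1, the density exceeds 1: 1 / (0.5 - 0) = 2
dunif(0.25, min = 0, max = 0.5)              # 2
```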

The punif() function calculates cumulative probability—the probability that a random value is less than or equal to x:

# P(X <= 5) for uniform on [0, 10]
punif(5, min = 0, max = 10)
# [1] 0.5  # 50% chance

# P(X <= 7.5)
punif(7.5, min = 0, max = 10)
# [1] 0.75  # 75% chance

# Calculate probability of range: P(3 < X <= 7)
punif(7, 0, 10) - punif(3, 0, 10)
# [1] 0.4  # 40% chance
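The exact range probability can be cross-checked by simulation, and upper-tail probabilities are available through the lower.tail argument. The seed and sample size below are arbitrary.

```r
set.seed(99)
draws <- runif(100000, min = 0, max = 10)

# Simulated share of draws in (3, 7]; should be close to the exact 0.4
mean(draws > 3 & draws <= 7)

# Upper-tail probability P(X > 7) via lower.tail = FALSE
punif(7, min = 0, max = 10, lower.tail = FALSE)   # 0.3
```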

Visualize the PDF and CDF together:

x <- seq(0, 10, length.out = 1000)
pdf_vals <- dunif(x, min = 0, max = 10)  # avoids masking the base pdf() device function
cdf_vals <- punif(x, min = 0, max = 10)

par(mfrow = c(1, 2))
plot(x, pdf_vals, type = "l", lwd = 2, col = "blue",
     main = "PDF: Uniform(0, 10)", ylab = "Density")
plot(x, cdf_vals, type = "l", lwd = 2, col = "red",
     main = "CDF: Uniform(0, 10)", ylab = "Cumulative Probability")

Quantile Function and Inverse CDF

The qunif() function finds the value corresponding to a given probability; it is the inverse of punif(). This is useful for finding percentiles and the bounds of central probability intervals:

# Find quartiles for uniform distribution on [0, 100]
quartiles <- qunif(c(0.25, 0.5, 0.75), min = 0, max = 100)
print(quartiles)
# [1] 25 50 75

# Find 95th percentile
qunif(0.95, min = 0, max = 100)
# [1] 95

# Find 5th and 95th percentiles (bounds of the central 90% interval)
qunif(c(0.05, 0.95), min = 0, max = 100)
# [1]  5 95
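Theoretical quantiles from qunif() can be compared with empirical quantiles of simulated data. This is a rough check under an arbitrary seed and sample size, not a formal test.

```r
set.seed(10)
draws <- runif(100000, min = 0, max = 100)

# Empirical quartiles from the simulated data
empirical <- quantile(draws, probs = c(0.25, 0.5, 0.75))
empirical

# Theoretical quartiles for comparison
theoretical <- qunif(c(0.25, 0.5, 0.75), min = 0, max = 100)
theoretical                                   # 25 50 75
```

With this many draws the empirical values should land within a fraction of a unit of 25, 50, and 75.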

Because qunif() maps probabilities to values, evaluating it at evenly spaced probabilities produces a stratified set of points that covers the range more evenly than pure random sampling:

# Regular random sampling (may have gaps)
set.seed(42)
regular_sample <- runif(10, 0, 1)

# Stratified points: quantiles at the midpoints of 10 equal-probability strata
# (deterministic, so no seed is needed)
probs <- (1:10 - 0.5) / 10  # 0.05, 0.15, 0.25, ..., 0.95
stratified_sample <- qunif(probs, 0, 1)

par(mfrow = c(1, 2))
plot(regular_sample, rep(1, 10), xlim = c(0, 1), ylim = c(0.5, 1.5),
     main = "Regular Random", xlab = "Value", ylab = "", yaxt = "n")
plot(stratified_sample, rep(1, 10), xlim = c(0, 1), ylim = c(0.5, 1.5),
     main = "Stratified", xlab = "Value", ylab = "", yaxt = "n")

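Stratification gives more even coverage of the probability space, which is useful in Monte Carlo simulations and survey sampling. Note that the points plotted here are deterministic; a randomized stratified sample draws one random value within each stratum.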
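The stratified points in the example are fixed midpoints. A randomized variant, sketched below, draws one uniform value inside each stratum so the sample keeps its even coverage but is still random. The variable names and the choice of 10 strata are assumptions for illustration.

```r
set.seed(42)
n <- 10

# Stratum i covers the interval from (i - 1) / n to i / n
lower_edges <- (0:(n - 1)) / n

# One uniform draw inside each stratum
stratified_random <- lower_edges + runif(n) / n

# Each stratum holds exactly one point
counts <- table(cut(stratified_random, breaks = (0:n) / n))
counts
```

Changing the seed moves each point within its stratum but never leaves a stratum empty.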

Practical Applications and Visualizations

Monte Carlo simulations frequently use uniform distributions. Here’s a classic example estimating π:

# Estimate pi by randomly throwing darts at a square
set.seed(123)
n <- 100000
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)

# Count points inside unit circle
inside_circle <- (x^2 + y^2) <= 1
pi_estimate <- 4 * sum(inside_circle) / n
print(pi_estimate)
# [1] 3.14204  # Close to actual π = 3.14159...
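A Monte Carlo estimate is more useful with a measure of its uncertainty. The sketch below repeats the simulation and adds a binomial standard error for the estimate; the 1.96 multiplier assumes a normal approximation, and the names p_hat, pi_hat, and se are illustrative.

```r
set.seed(123)
n <- 100000
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)

p_hat <- mean(x^2 + y^2 <= 1)          # share of points inside the unit circle
pi_hat <- 4 * p_hat

# Binomial standard error of p_hat, scaled by 4 for the pi estimate
se <- 4 * sqrt(p_hat * (1 - p_hat) / n)

# Approximate 95% interval for the estimate
c(lower = pi_hat - 1.96 * se, estimate = pi_hat, upper = pi_hat + 1.96 * se)
```

Because the standard error shrinks with the square root of n, each extra digit of accuracy costs roughly 100 times more samples.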

Create visualizations with ggplot2 for multiple uniform distributions:

library(ggplot2)

# Generate data from different uniform distributions
data <- data.frame(
  value = c(runif(1000, 0, 10),
            runif(1000, 5, 15),
            runif(1000, 10, 20)),
  distribution = rep(c("U(0,10)", "U(5,15)", "U(10,20)"), each = 1000)
)

ggplot(data, aes(x = value, fill = distribution)) +
  geom_histogram(alpha = 0.6, position = "identity", bins = 50) +
  theme_minimal() +
  labs(title = "Comparison of Uniform Distributions",
       x = "Value", y = "Frequency")

Test if data follows a uniform distribution using the Kolmogorov-Smirnov test:

# Generate uniform data
uniform_sample <- runif(1000, 0, 1)

# Test for uniformity
ks.test(uniform_sample, "punif", min = 0, max = 1)
# p-value is usually > 0.05 (fails to reject uniformity), but truly
# uniform data still falls below 0.05 about 5% of the time

# Test non-uniform data
normal_sample <- rnorm(1000, 0.5, 0.2)
ks.test(normal_sample, "punif", min = 0, max = 1)
# p-value should be < 0.05 (rejects uniformity)
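Because a p-value is itself random, a single test can reject truly uniform data by chance. The sketch below repeats the Kolmogorov-Smirnov test on many uniform samples and reports how often it rejects at the 5% level. The sample size of 200 and the 500 repetitions are arbitrary choices.

```r
set.seed(2023)
p_values <- replicate(
  500,
  ks.test(runif(200), "punif", min = 0, max = 1)$p.value
)

# Share of truly uniform samples rejected at the 5% level; expected near 0.05
mean(p_values < 0.05)
```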

Common Pitfalls and Best Practices

Don’t assume all random processes are uniform. Real-world data rarely follows a uniform distribution. Human choices, natural phenomena, and measurement errors typically show clustering or bias. Use uniform distributions for simulation inputs, not to model observed data without justification.

Watch for boundary behavior:

# Incorrect: trying to generate integers 1-10
wrong_dice <- floor(runif(100, 1, 10))
table(wrong_dice)  # Will never produce 10!

# Correct: extend upper bound
correct_dice <- floor(runif(100, 1, 11))
table(correct_dice)  # Produces 1-10 uniformly
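The boundary fix can be wrapped in a small helper so the "+ 1" is not forgotten. rand_int is a hypothetical name used only for this sketch; it is not part of base R.

```r
# Hypothetical helper: n uniform random integers on lo..hi, both ends included
rand_int <- function(n, lo, hi) {
  floor(runif(n, min = lo, max = hi + 1))
}

set.seed(5)
draws <- rand_int(10000, 1, 10)
range(draws)      # 1 10
```

For most integer sampling, sample(lo:hi, n, replace = TRUE) expresses the same intent without any boundary arithmetic.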

Performance matters for large simulations:

# Inefficient: loop with single calls
system.time({
  result <- numeric(1000000)
  for(i in 1:1000000) result[i] <- runif(1)
})

# Efficient: vectorized call
system.time({
  result <- runif(1000000)
})
# The vectorized call is typically far faster (often around 100x);
# the exact ratio depends on your machine and R version

Always set seeds for reproducible research:

# Bad practice - unreproducible
results <- runif(100)

# Good practice - reproducible
set.seed(42)
results <- runif(100)
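A seed fixes the entire stream of random numbers, not just the next call. The sketch below shows that two consecutive draws after set.seed() match a single longer draw from the same seed, which is why the order of calls matters for reproducibility.

```r
set.seed(42)
first_two <- runif(2)
next_two  <- runif(2)

set.seed(42)
all_four <- runif(4)

# Two draws of 2 reproduce one draw of 4 from the same seed
identical(c(first_two, next_two), all_four)   # TRUE
```

Inserting an extra random call between set.seed() and your analysis shifts every later value, so set the seed immediately before the code you want to reproduce.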

The uniform distribution is deceptively simple but incredibly powerful. Master these functions and you’ll have a solid foundation for more advanced statistical computing, simulation, and probabilistic programming in R.