How to Plot the Weibull Distribution in R
The Weibull distribution is one of the most versatile probability distributions in applied statistics. Named after Swedish mathematician Waloddi Weibull, it excels at modeling time-to-failure data,...
Key Insights
- R provides four core Weibull functions (
dweibull(),pweibull(),qweibull(),rweibull()) that handle density, probability, quantiles, and random sampling respectively—master these and you can visualize any aspect of the distribution. - The shape parameter fundamentally changes the Weibull’s behavior: values below 1 indicate decreasing failure rates (infant mortality), values above 1 indicate increasing failure rates (wear-out), and exactly 1 reduces to the exponential distribution.
- For publication-quality plots, use ggplot2’s
stat_function()to draw smooth Weibull curves without manually generating sequences of points—it’s cleaner and more flexible for comparing multiple parameter combinations.
Introduction to the Weibull Distribution
The Weibull distribution is one of the most versatile probability distributions in applied statistics. Named after Swedish mathematician Waloddi Weibull, it excels at modeling time-to-failure data, making it indispensable in reliability engineering, survival analysis, and quality control.
The distribution has two parameters:
- Shape (k or β): Controls the failure rate behavior over time
- Scale (λ): Stretches or compresses the distribution along the x-axis
What makes the Weibull special is its flexibility. By adjusting the shape parameter, you can model:
- Decreasing failure rates (shape < 1): Early failures, “infant mortality” in components
- Constant failure rates (shape = 1): Random failures, equivalent to exponential distribution
- Increasing failure rates (shape > 1): Wear-out failures, aging effects
This article walks through plotting the Weibull distribution in R, from basic visualizations to publication-ready graphics.
Understanding Weibull Functions in R
R provides four built-in functions for working with the Weibull distribution. Each serves a distinct purpose:
# dweibull(): Probability Density Function (PDF)
# Returns the height of the density curve at a given point
dweibull(x = 2, shape = 1.5, scale = 1)
# pweibull(): Cumulative Distribution Function (CDF)
# Returns P(X <= x), the probability of observing a value less than or equal to x
pweibull(q = 2, shape = 1.5, scale = 1)
# qweibull(): Quantile Function (inverse CDF)
# Returns the value x such that P(X <= x) = p
qweibull(p = 0.5, shape = 1.5, scale = 1)
# rweibull(): Random Number Generation
# Generates random samples from the Weibull distribution
rweibull(n = 100, shape = 1.5, scale = 1)
A quick demonstration showing these functions in action:
# Set parameters
shape_param <- 2
scale_param <- 5
# Density at x = 3
density_at_3 <- dweibull(3, shape = shape_param, scale = scale_param)
cat("Density at x=3:", round(density_at_3, 4), "\n")
# Probability that X <= 3
prob_less_3 <- pweibull(3, shape = shape_param, scale = scale_param)
cat("P(X <= 3):", round(prob_less_3, 4), "\n")
# Median (50th percentile)
median_val <- qweibull(0.5, shape = shape_param, scale = scale_param)
cat("Median:", round(median_val, 4), "\n")
# Generate 5 random samples
set.seed(42)
samples <- rweibull(5, shape = shape_param, scale = scale_param)
cat("Random samples:", round(samples, 2), "\n")
Output:
Density at x=3: 0.1318
P(X <= 3): 0.3023
Median: 4.1628
Random samples: 6.84 3.3 5.36 5.72 5.16
Plotting the Probability Density Function (PDF)
The PDF shows the relative likelihood of different values. Here’s how to create a basic Weibull PDF plot using base R:
# Generate x values
x <- seq(0, 3, length.out = 200)
# Calculate density for shape = 1.5, scale = 1
y <- dweibull(x, shape = 1.5, scale = 1)
# Create the plot
plot(x, y,
type = "l",
lwd = 2,
col = "steelblue",
main = "Weibull Probability Density Function",
xlab = "x",
ylab = "Density",
las = 1)
More useful is comparing multiple shape parameters on one plot:
# Set up x values
x <- seq(0, 2.5, length.out = 300)
# Define shape parameters to compare
shapes <- c(0.5, 1, 1.5, 2, 5)
colors <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00")
# Create empty plot
plot(NULL,
xlim = c(0, 2.5),
ylim = c(0, 2.5),
xlab = "x",
ylab = "Density",
main = "Weibull PDF: Effect of Shape Parameter (scale = 1)",
las = 1)
# Add curves for each shape parameter
for (i in seq_along(shapes)) {
y <- dweibull(x, shape = shapes[i], scale = 1)
lines(x, y, col = colors[i], lwd = 2)
}
# Add legend
legend("topright",
legend = paste("shape =", shapes),
col = colors,
lwd = 2,
bty = "n")
# Add grid for readability
grid(col = "gray90")
This visualization immediately reveals how the shape parameter transforms the distribution. When shape is 0.5, the density is highest near zero and decreases monotonically—classic infant mortality behavior. When shape is 5, you get a bell-shaped curve skewed to the right, representing wear-out failure modes.
Plotting the Cumulative Distribution Function (CDF)
The CDF answers the question “What’s the probability of failure by time x?” This is often more directly useful in reliability analysis than the PDF.
# Generate x values
x <- seq(0, 3, length.out = 300)
# Set up plot area
par(mfrow = c(1, 2))
# Plot CDF
cdf <- pweibull(x, shape = 2, scale = 1)
plot(x, cdf,
type = "l",
lwd = 2,
col = "steelblue",
main = "CDF: F(x) = P(X ≤ x)",
xlab = "x",
ylab = "Cumulative Probability",
las = 1)
grid(col = "gray90")
# Plot Survival Function (complement of CDF)
survival <- 1 - cdf
plot(x, survival,
type = "l",
lwd = 2,
col = "darkred",
main = "Survival Function: S(x) = P(X > x)",
xlab = "x",
ylab = "Survival Probability",
las = 1)
grid(col = "gray90")
# Reset plot area
par(mfrow = c(1, 1))
The survival function is particularly important in reliability engineering. It directly answers “What’s the probability that a component survives past time x?”
Here’s a more practical example comparing survival curves for different shape parameters:
x <- seq(0, 3, length.out = 300)
shapes <- c(0.5, 1, 2, 3)
colors <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3")
plot(NULL,
xlim = c(0, 3),
ylim = c(0, 1),
xlab = "Time",
ylab = "Survival Probability",
main = "Weibull Survival Curves by Shape Parameter",
las = 1)
for (i in seq_along(shapes)) {
survival <- 1 - pweibull(x, shape = shapes[i], scale = 1)
lines(x, survival, col = colors[i], lwd = 2)
}
legend("topright",
legend = paste("shape =", shapes),
col = colors,
lwd = 2,
bty = "n")
abline(h = 0.5, lty = 2, col = "gray50")
text(2.8, 0.53, "Median", cex = 0.8)
Enhanced Visualizations with ggplot2
For publication-quality graphics, ggplot2 offers superior control over aesthetics. The stat_function() layer is particularly elegant for plotting mathematical functions.
library(ggplot2)
# Basic Weibull PDF with ggplot2
ggplot(data.frame(x = c(0, 3)), aes(x = x)) +
stat_function(fun = dweibull,
args = list(shape = 2, scale = 1),
linewidth = 1,
color = "steelblue") +
labs(title = "Weibull PDF (shape = 2, scale = 1)",
x = "x",
y = "Density") +
theme_minimal()
To compare multiple parameter combinations, create a data frame and use faceting:
library(ggplot2)
library(dplyr)
library(tidyr)
# Create parameter grid
params <- expand.grid(
shape = c(0.5, 1, 2, 5),
scale = c(0.5, 1, 2)
)
# Generate data for all combinations
x_vals <- seq(0.01, 4, length.out = 200)
plot_data <- params %>%
rowwise() %>%
mutate(data = list(data.frame(
x = x_vals,
density = dweibull(x_vals, shape = shape, scale = scale)
))) %>%
unnest(data) %>%
mutate(shape_label = paste("shape =", shape),
scale_label = paste("scale =", scale))
# Create faceted plot
ggplot(plot_data, aes(x = x, y = density, color = factor(shape))) +
geom_line(linewidth = 1) +
facet_wrap(~ scale_label, scales = "free_y") +
scale_color_brewer(palette = "Set1", name = "Shape") +
labs(title = "Weibull PDF: Shape and Scale Parameter Effects",
x = "x",
y = "Density") +
theme_minimal() +
theme(legend.position = "bottom")
For a cleaner approach using stat_function() with multiple curves:
library(ggplot2)
# Define shapes and create color mapping
shapes <- c(0.5, 1, 1.5, 2, 5)
ggplot(data.frame(x = c(0, 2.5)), aes(x = x)) +
stat_function(fun = dweibull, args = list(shape = 0.5, scale = 1),
aes(color = "0.5"), linewidth = 1) +
stat_function(fun = dweibull, args = list(shape = 1, scale = 1),
aes(color = "1"), linewidth = 1) +
stat_function(fun = dweibull, args = list(shape = 1.5, scale = 1),
aes(color = "1.5"), linewidth = 1) +
stat_function(fun = dweibull, args = list(shape = 2, scale = 1),
aes(color = "2"), linewidth = 1) +
stat_function(fun = dweibull, args = list(shape = 5, scale = 1),
aes(color = "5"), linewidth = 1) +
scale_color_brewer(palette = "Set1", name = "Shape Parameter") +
coord_cartesian(ylim = c(0, 2.5)) +
labs(title = "Weibull PDF Comparison",
subtitle = "Scale parameter fixed at 1",
x = "x",
y = "Density") +
theme_minimal() +
theme(legend.position = "right")
Practical Application: Fitting Weibull to Real Data
In practice, you’ll often need to fit a Weibull distribution to observed data. The MASS package provides fitdistr() for maximum likelihood estimation.
library(MASS)
library(ggplot2)
# Simulate failure time data (in practice, this would be real data)
set.seed(123)
failure_times <- rweibull(200, shape = 2.5, scale = 100)
# Fit Weibull distribution
fit <- fitdistr(failure_times, "weibull")
print(fit)
# Extract fitted parameters
fitted_shape <- fit$estimate["shape"]
fitted_scale <- fit$estimate["scale"]
cat("\nFitted shape:", round(fitted_shape, 3))
cat("\nFitted scale:", round(fitted_scale, 3))
Now overlay the fitted distribution on a histogram:
# Create histogram with fitted curve
ggplot(data.frame(x = failure_times), aes(x = x)) +
geom_histogram(aes(y = after_stat(density)),
bins = 25,
fill = "steelblue",
alpha = 0.7,
color = "white") +
stat_function(fun = dweibull,
args = list(shape = fitted_shape, scale = fitted_scale),
color = "darkred",
linewidth = 1.2) +
labs(title = "Failure Time Data with Fitted Weibull Distribution",
subtitle = sprintf("Fitted: shape = %.2f, scale = %.2f",
fitted_shape, fitted_scale),
x = "Failure Time",
y = "Density") +
theme_minimal()
For a base R version:
# Histogram with fitted curve (base R)
hist(failure_times,
breaks = 25,
probability = TRUE,
main = "Failure Times with Fitted Weibull",
xlab = "Failure Time",
col = "lightblue",
border = "white")
# Add fitted curve
curve(dweibull(x, shape = fitted_shape, scale = fitted_scale),
add = TRUE,
col = "darkred",
lwd = 2)
legend("topright",
legend = sprintf("Fitted Weibull\nshape = %.2f\nscale = %.2f",
fitted_shape, fitted_scale),
bty = "n")
Summary and Additional Resources
You now have the tools to visualize the Weibull distribution in R effectively. The key functions are straightforward:
dweibull()for density plots (PDF)pweibull()for cumulative probability plots (CDF) and survival curvesfitdistr()from MASS for parameter estimation from data
For base R, the plot() and curve() functions handle most needs. For publication-quality graphics, ggplot2’s stat_function() provides cleaner syntax and better aesthetics.
For deeper exploration, consider the survival package for Kaplan-Meier estimators and Cox regression, the flexsurv package for flexible parametric survival models, and the WeibullR package for specialized Weibull analysis tools used in reliability engineering.