How to Perform the Kolmogorov-Smirnov Test in R

Key Insights

  • The Kolmogorov-Smirnov test compares distributions by measuring the maximum vertical distance between cumulative distribution functions, making it useful for both normality testing and comparing two empirical samples.
  • Always use the Lilliefors correction when testing normality with estimated parameters—the standard K-S test produces inflated p-values when you estimate mean and standard deviation from the same data.
  • The K-S test assumes continuous distributions, so tied values in rounded or discrete-like data trigger warnings and make p-values approximate—break ties with a small jitter or reach for a test designed for discrete data.

Introduction to the Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov (K-S) test is a nonparametric test that compares probability distributions. Unlike tests that focus on specific moments like mean or variance, the K-S test examines the entire shape of distributions by comparing their cumulative distribution functions (CDFs).

The test comes in two variants. The one-sample K-S test compares your data against a theoretical distribution—testing whether your sample could have been drawn from a normal, uniform, exponential, or any other specified distribution. The two-sample K-S test compares two empirical samples to determine if they come from the same underlying distribution.

When should you reach for the K-S test instead of alternatives like Shapiro-Wilk or Anderson-Darling? The K-S test shines when you need distribution-free comparisons and when you’re comparing two samples rather than testing against a theoretical distribution. It’s also useful when you care about the entire distribution shape, not just normality. However, for pure normality testing with a single sample, Shapiro-Wilk typically has more statistical power.
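That power gap is easy to demonstrate with a quick simulation. The sketch below uses only base R (`shapiro.test` and `ks.test` both ship with the stats package) and counts how often each test rejects normality at alpha = 0.05 when the data are actually exponential:

```r
# Rough power comparison (sketch): how often each test rejects normality
# at alpha = 0.05 when the data are actually exponential, n = 30 per draw
set.seed(99)
reps <- 500
sw_power <- mean(replicate(reps, shapiro.test(rexp(30))$p.value < 0.05))
ks_power <- mean(replicate(reps, {
  x <- rexp(30)
  # parameters estimated from the data, as commonly (mis)used;
  # see the pitfalls section below
  ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value < 0.05
}))
c(shapiro_wilk = sw_power, ks = ks_power)
```

Exact rejection rates vary with the seed, but Shapiro-Wilk rejects far more often on the same draws.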

Prerequisites and Setup

The K-S test lives in R’s built-in stats package, so you don’t need to install anything for basic functionality. For visualization and the Lilliefors correction, you’ll want a few additional packages.

# Core functionality - already loaded with base R
# library(stats)

# For visualization
library(ggplot2)

# For Lilliefors test (K-S with estimated parameters)
library(nortest)

# Generate sample data for examples
set.seed(42)
normal_data <- rnorm(100, mean = 50, sd = 10)
skewed_data <- rexp(100, rate = 0.1)
comparison_data <- rnorm(100, mean = 52, sd = 10)

One-Sample K-S Test: Testing Against a Known Distribution

The one-sample test answers this question: “Could my data have come from this specific distribution?” The test calculates the D statistic—the maximum absolute difference between your sample’s empirical CDF and the theoretical CDF.

# Test if data follows a standard normal distribution
result <- ks.test(normal_data, "pnorm", mean = 50, sd = 10)
print(result)
	Exact one-sample Kolmogorov-Smirnov test

data:  normal_data
D = 0.063385, p-value = 0.8037
alternative hypothesis: two-sided

The output gives you two key values. The D statistic (0.063) represents the maximum vertical distance between the empirical and theoretical CDFs. Smaller values indicate better fit. The p-value (0.80) tells you the probability of observing a D statistic this extreme if the data truly came from the specified distribution. A high p-value means you cannot reject the null hypothesis—the data is consistent with the theoretical distribution.
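If you want to see exactly what ks.test is measuring, the D statistic can be reproduced by hand from the sorted sample. This sketch uses only base R and regenerates the same data as the setup section:

```r
# Hand-compute D for the one-sample test: the largest gap between the
# step ECDF and the theoretical CDF, checked on both sides of each step
set.seed(42)
normal_data <- rnorm(100, mean = 50, sd = 10)  # same data as above
x    <- sort(normal_data)
n    <- length(x)
theo <- pnorm(x, mean = 50, sd = 10)        # theoretical CDF at each order statistic
d_plus  <- max(seq_len(n) / n - theo)       # ECDF step top sits above the CDF
d_minus <- max(theo - (seq_len(n) - 1) / n) # CDF sits above the ECDF step bottom
D <- max(d_plus, d_minus)
D  # identical to the D statistic ks.test reported above
```

Checking both sides of each step matters because the ECDF jumps at every observation, so the largest gap can occur just before a jump rather than at it.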

You can test against any distribution R knows about:

# Test against uniform distribution
uniform_sample <- runif(100, min = 0, max = 1)
ks.test(uniform_sample, "punif", min = 0, max = 1)

# Test against exponential distribution
exp_sample <- rexp(100, rate = 2)
ks.test(exp_sample, "pexp", rate = 2)

# Test skewed data against normal - should reject
# (note: parameters are estimated from the data here, which inflates
# p-values; see the Lilliefors discussion in the pitfalls section)
ks.test(skewed_data, "pnorm", mean = mean(skewed_data), sd = sd(skewed_data))

Two-Sample K-S Test: Comparing Two Datasets

The two-sample variant compares two empirical distributions directly. This is invaluable for A/B testing, before/after comparisons, or validating that two data sources follow the same distribution.

# Compare two samples
two_sample_result <- ks.test(normal_data, comparison_data)
print(two_sample_result)
	Asymptotic two-sample Kolmogorov-Smirnov test

data:  normal_data and comparison_data
D = 0.13, p-value = 0.3521
alternative hypothesis: two-sided

The interpretation changes slightly here. A high p-value (0.35) indicates insufficient evidence to conclude the distributions differ. The samples could plausibly come from the same underlying distribution.

Here’s a practical A/B testing scenario:

# Simulated response times (milliseconds) from two server configurations
set.seed(123)
server_a_times <- rgamma(150, shape = 2, rate = 0.01)  # Original config
server_b_times <- rgamma(150, shape = 2.5, rate = 0.012)  # New config

ab_result <- ks.test(server_a_times, server_b_times)
print(ab_result)

# Calculate practical metrics alongside
cat("\nServer A - Mean:", round(mean(server_a_times), 1), "ms\n")
cat("Server B - Mean:", round(mean(server_b_times), 1), "ms\n")
cat("Distribution difference detected:", ab_result$p.value < 0.05, "\n")

Visualizing K-S Test Results

Numbers tell part of the story; visualization tells the rest. Plotting empirical CDFs makes the D statistic intuitive—it’s literally the largest vertical gap between the curves.

# Create ECDF plot comparing two samples
library(ggplot2)

# Combine data for plotting
plot_data <- data.frame(
  value = c(normal_data, comparison_data),
  group = rep(c("Sample A", "Sample B"), each = 100)
)

# Calculate D statistic location for annotation
ecdf_a <- ecdf(normal_data)
ecdf_b <- ecdf(comparison_data)
all_values <- sort(unique(c(normal_data, comparison_data)))
differences <- abs(ecdf_a(all_values) - ecdf_b(all_values))
max_diff_idx <- which.max(differences)
max_diff_x <- all_values[max_diff_idx]
max_diff_y1 <- ecdf_a(max_diff_x)
max_diff_y2 <- ecdf_b(max_diff_x)

# Create the plot
ggplot(plot_data, aes(x = value, color = group)) +
  stat_ecdf(linewidth = 1) +
  annotate(
    "segment",
    x = max_diff_x, xend = max_diff_x,
    y = max_diff_y1, yend = max_diff_y2,
    color = "red", linewidth = 1.5, linetype = "dashed"
  ) +
  annotate("text", x = max_diff_x + 3, y = (max_diff_y1 + max_diff_y2) / 2,
           label = paste("D =", round(max(differences), 3)),
           color = "red", fontface = "bold") +
  labs(
    title = "Empirical CDFs with K-S D Statistic",
    x = "Value",
    y = "Cumulative Probability",
    color = "Sample"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")

For one-sample tests, overlay the theoretical CDF:

# One-sample visualization
ggplot(data.frame(x = normal_data), aes(x = x)) +
  stat_ecdf(aes(color = "Empirical"), linewidth = 1) +
  stat_function(
    fun = pnorm, 
    args = list(mean = 50, sd = 10),
    aes(color = "Theoretical Normal"),
    linewidth = 1
  ) +
  labs(
    title = "Sample vs. Theoretical Normal Distribution",
    x = "Value",
    y = "Cumulative Probability",
    color = "Distribution"
  ) +
  theme_minimal()

Limitations and Common Pitfalls

The K-S test has several gotchas that trip up practitioners.

Parameter estimation bias: When you estimate distribution parameters from your data (like using mean(x) and sd(x) for a normality test), the standard K-S test produces inflated p-values and becomes far too conservative—it will rarely reject even when the data are not normal. Use the Lilliefors test instead:

library(nortest)

# Wrong approach - inflated p-values
wrong_result <- ks.test(normal_data, "pnorm", 
                         mean = mean(normal_data), 
                         sd = sd(normal_data))

# Correct approach - Lilliefors test
correct_result <- lillie.test(normal_data)

cat("Standard K-S p-value:", wrong_result$p.value, "\n")
cat("Lilliefors p-value:", correct_result$p.value, "\n")
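A short simulation makes the bias concrete. When the null hypothesis is true, p-values from a well-calibrated test are uniform on [0, 1]; with estimated parameters, the standard K-S test piles them up near 1. A sketch using only base R:

```r
# Under the null (the data really are normal), a calibrated test gives
# uniform p-values; with estimated parameters they cluster near 1
set.seed(1)
pvals <- replicate(300, {
  x <- rnorm(50)
  ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value
})
mean(pvals > 0.5)  # far above the 0.5 a calibrated test would give
```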

Ties in data: The K-S test assumes continuous distributions. Tied values (duplicates) violate this assumption:

# Data with ties triggers a warning
discrete_like <- round(rnorm(100, 50, 10))
ks.test(discrete_like, "pnorm", mean = 50, sd = 10)

# exact = FALSE falls back to the asymptotic p-value (the ties warning
# itself still appears); jittering is a common way to break the ties
ks.test(discrete_like, "pnorm", mean = 50, sd = 10, exact = FALSE)
ks.test(jitter(discrete_like), "pnorm", mean = 50, sd = 10)

Sample size sensitivity: With large samples, the test detects trivial differences. With small samples, it lacks power to detect real differences. Always pair statistical significance with effect size considerations.

# Large sample detects tiny difference
set.seed(42)
large_a <- rnorm(10000, mean = 0, sd = 1)
large_b <- rnorm(10000, mean = 0.05, sd = 1)  # Barely different

ks.test(large_a, large_b)  # Likely significant despite trivial difference
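One useful habit is to report the D statistic itself as the effect size alongside the p-value: D is the maximum ECDF gap, so it stays small when a difference is statistically detectable but practically trivial. A self-contained sketch repeating the large-sample setup above:

```r
# Report D next to p: for these barely-different samples, D is tiny
# in absolute terms even when the p-value crosses 0.05
set.seed(42)
large_a <- rnorm(10000, mean = 0, sd = 1)
large_b <- rnorm(10000, mean = 0.05, sd = 1)
res <- ks.test(large_a, large_b)
c(D = unname(res$statistic), p = res$p.value)
```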

Practical Example: End-to-End Workflow

Let’s work through a complete analysis. You’re comparing API response times before and after a performance optimization.

# Complete K-S test workflow for response time analysis
library(ggplot2)
library(nortest)

# Simulated response time data (milliseconds)
set.seed(2024)
before_optimization <- c(
  rlnorm(200, meanlog = 5, sdlog = 0.5),
  rlnorm(20, meanlog = 6, sdlog = 0.3)  # Some slow requests
)
after_optimization <- rlnorm(220, meanlog = 4.8, sdlog = 0.4)

# Step 1: Exploratory summary
cat("=== Response Time Summary ===\n")
cat("Before - Mean:", round(mean(before_optimization), 1), "ms,",
    "Median:", round(median(before_optimization), 1), "ms\n")
cat("After  - Mean:", round(mean(after_optimization), 1), "ms,",
    "Median:", round(median(after_optimization), 1), "ms\n\n")

# Step 2: Test if distributions are normal (they probably aren't)
cat("=== Normality Tests (Lilliefors) ===\n")
before_normal <- lillie.test(before_optimization)
after_normal <- lillie.test(after_optimization)
cat("Before optimization - p-value:", format(before_normal$p.value, digits = 3), "\n")
cat("After optimization  - p-value:", format(after_normal$p.value, digits = 3), "\n\n")

# Step 3: Two-sample K-S test
cat("=== Two-Sample K-S Test ===\n")
ks_result <- ks.test(before_optimization, after_optimization)
print(ks_result)

# Step 4: Interpretation
cat("\n=== Interpretation ===\n")
if (ks_result$p.value < 0.05) {
  cat("The distributions differ significantly (p =", 
      format(ks_result$p.value, digits = 3), ")\n")
  cat("D statistic:", round(ks_result$statistic, 3), "\n")
  cat("The optimization changed the response time distribution.\n")
} else {
  cat("No significant difference detected between distributions.\n")
}

# Step 5: Visualization
plot_df <- data.frame(
  time = c(before_optimization, after_optimization),
  period = rep(c("Before", "After"), c(length(before_optimization), 
                                        length(after_optimization)))
)

ggplot(plot_df, aes(x = time, color = period)) +
  stat_ecdf(linewidth = 1.2) +
  scale_x_log10() +
  labs(
    title = "API Response Time Distribution: Before vs. After Optimization",
    subtitle = paste("K-S Test: D =", round(ks_result$statistic, 3), 
                     ", p =", format(ks_result$p.value, digits = 3)),
    x = "Response Time (ms, log scale)",
    y = "Cumulative Probability",
    color = "Period"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")

This workflow gives you a complete picture: summary statistics for context, normality checks to justify using nonparametric methods, the K-S test result, and a visualization that makes the distributional shift immediately apparent.

The K-S test won’t tell you everything about your distributions, but it provides a rigorous, assumption-light method for detecting distributional differences. Combine it with visualizations and domain knowledge for actionable insights.
