How to Plot the ROC Curve in R

Key Insights

  • ROC curves visualize the trade-off between true positive and false positive rates across all classification thresholds, with AUC providing a single metric for model comparison (0.5 = random, 1.0 = perfect)
  • The pROC package offers the fastest path to ROC plotting in R with one-line curve generation, while ggplot2 provides publication-quality customization for professional presentations
  • Always plot multiple models together for meaningful comparison, but be cautious with imbalanced datasets where precision-recall curves may be more informative than ROC curves

Introduction to ROC Curves

The Receiver Operating Characteristic (ROC) curve is the gold standard for evaluating binary classification models. It plots the True Positive Rate (sensitivity) against the False Positive Rate (1 - specificity) across all possible classification thresholds. Unlike accuracy, which collapses model performance into a single number at one threshold, ROC curves show you the full spectrum of trade-offs your model can make.

The Area Under the Curve (AUC) summarizes this performance into a single metric. An AUC of 0.5 means your model performs no better than random guessing—you might as well flip a coin. An AUC of 1.0 represents perfect classification. In practice, most good models fall between 0.7 and 0.95, depending on the problem difficulty.
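A useful way to build intuition for AUC: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. Here is a minimal base-R sketch with made-up scores (ties counted as half):

```r
# AUC as a rank statistic:
# P(score of a random positive > score of a random negative)
actual <- c(0, 0, 0, 1, 1)
scores <- c(0.10, 0.40, 0.35, 0.80, 0.70)

pos <- scores[actual == 1]
neg <- scores[actual == 0]

# Compare every positive score against every negative score;
# ties contribute 0.5
pairs <- outer(pos, neg, ">") + 0.5 * outer(pos, neg, "==")
auc_manual <- mean(pairs)
auc_manual  # 1 here: every positive outranks every negative
```

With overlapping score distributions this value drops below 1, and it matches what auc() from pROC reports on the same data.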

ROC curves matter because they’re threshold-independent. You can build your model once, then adjust the classification threshold based on your business requirements without retraining. Need to catch every possible fraud case? Move your threshold to maximize sensitivity. Want to minimize false alarms? Optimize for specificity instead.
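To make that concrete, here is a small base-R sketch (toy labels and probabilities, assumed for illustration) computing sensitivity and specificity at two thresholds from the same fixed predictions:

```r
actual <- c(1, 1, 1, 0, 0, 0, 0, 0)
probs  <- c(0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05)

rates_at <- function(threshold) {
  pred <- as.integer(probs >= threshold)
  c(sensitivity = sum(pred == 1 & actual == 1) / sum(actual == 1),
    specificity = sum(pred == 0 & actual == 0) / sum(actual == 0))
}

rates_at(0.50)  # sensitivity 0.667, specificity 0.8
rates_at(0.25)  # sensitivity 1.0,   specificity 0.6 -- more catches, more alarms
```

Lowering the threshold trades false alarms for extra catches; no retraining happens anywhere.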

Prerequisites and Setup

You’ll need three packages for comprehensive ROC analysis in R. The pROC package is your workhorse for calculations, ggplot2 handles beautiful visualizations, and caret helps with model building if you’re starting from scratch.

# Install packages (run once)
install.packages("pROC")
install.packages("ggplot2")
install.packages("caret")

# Load libraries
library(pROC)
library(ggplot2)
library(caret)

The pROC package is specifically designed for ROC analysis and includes functions for statistical comparison between curves. ggplot2 gives you pixel-perfect control over aesthetics. You can skip caret if you already have predictions from your model.

Preparing Sample Data

Let’s create realistic sample data that mimics a typical binary classification scenario. We’ll simulate a medical diagnosis problem with 500 patients, where we’re predicting disease presence.

# Set seed for reproducibility
set.seed(123)

# Generate sample data
n <- 500
actual_labels <- sample(c(0, 1), n, replace = TRUE, prob = c(0.7, 0.3))

# Simulate predicted probabilities
# Add noise to make it realistic - not perfect separation
predicted_probs <- ifelse(actual_labels == 1, 
                          rbeta(n, 5, 2),  # Higher probs for positive class
                          rbeta(n, 2, 5))  # Lower probs for negative class

# Create data frame
classification_data <- data.frame(
  actual = actual_labels,
  predicted = predicted_probs
)

# View first few rows
head(classification_data)

This code creates a dataset where positive cases tend to have higher predicted probabilities, but there’s overlap—exactly what you’d see with a real model that isn’t perfect. The beta distribution gives us realistic probability values between 0 and 1.

Creating a Basic ROC Curve with pROC

The pROC package makes basic ROC plotting almost trivial. You can generate a complete ROC curve with AUC in just a few lines.

# Create ROC object
roc_obj <- roc(classification_data$actual, 
               classification_data$predicted)

# Plot basic ROC curve
plot(roc_obj, 
     main = "ROC Curve - Basic",
     col = "#1c61b6",
     lwd = 2)

# Add AUC to the plot
text(0.5, 0.3, 
     paste("AUC =", round(auc(roc_obj), 3)),
     cex = 1.5)

# Print AUC value
print(paste("AUC:", round(auc(roc_obj), 3)))

The roc() function automatically handles all the threshold calculations. It places a threshold between each pair of consecutive predicted probabilities and computes sensitivity and specificity at each one. The resulting curve shows you exactly how your model performs across the entire operating range.

The diagonal line from (0,0) to (1,1) represents random guessing. Your curve should bow toward the upper-left corner. The further from the diagonal, the better your model.
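For intuition, the sweep that roc() performs can be reproduced in a few lines of base R. This is a simplified sketch on toy data; pROC itself places thresholds between consecutive values and handles ties and direction more carefully:

```r
actual <- c(1, 1, 1, 0, 0, 0)
probs  <- c(0.9, 0.7, 0.4, 0.6, 0.3, 0.1)

# Try each unique predicted probability as a threshold
thresholds <- sort(unique(probs), decreasing = TRUE)
tpr <- sapply(thresholds, function(t) mean(probs[actual == 1] >= t))
fpr <- sapply(thresholds, function(t) mean(probs[actual == 0] >= t))

# Step plot of the resulting points, with (0,0) and (1,1) added
plot(c(0, fpr, 1), c(0, tpr, 1), type = "s",
     xlab = "False Positive Rate", ylab = "True Positive Rate")
abline(0, 1, lty = 2)  # random-guess diagonal
```

Each threshold contributes one (FPR, TPR) point; the curve is just those points connected in threshold order.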

Customizing ROC Plots with ggplot2

For publication-quality figures or presentations, you’ll want more control over aesthetics. Here’s how to build a professional ROC curve with ggplot2.

# Extract ROC curve coordinates
roc_data <- data.frame(
  sensitivity = roc_obj$sensitivities,
  specificity = roc_obj$specificities,
  threshold = roc_obj$thresholds
)

# Calculate 1 - specificity (FPR)
roc_data$fpr <- 1 - roc_data$specificity

# Create polished ggplot
ggplot(roc_data, aes(x = fpr, y = sensitivity)) +
  geom_line(color = "#1c61b6", linewidth = 1.2) +
  geom_abline(intercept = 0, slope = 1, 
              linetype = "dashed", color = "gray50") +
  annotate("text", x = 0.7, y = 0.3, 
           label = paste("AUC =", round(auc(roc_obj), 3)),
           size = 5) +
  labs(
    title = "ROC Curve - Model Performance",
    x = "False Positive Rate (1 - Specificity)",
    y = "True Positive Rate (Sensitivity)"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 14, face = "bold"),
    axis.title = element_text(size = 12),
    axis.text = element_text(size = 10)
  ) +
  coord_fixed()

# Save plot
ggsave("roc_curve.png", width = 8, height = 6, dpi = 300)

The coord_fixed() ensures your plot has equal scaling on both axes, which is important for ROC curves since both axes represent rates from 0 to 1. The dashed diagonal reference line makes it immediately obvious whether your model beats random chance.

Comparing Multiple Models

The real power of ROC curves emerges when comparing multiple models. Let’s simulate predictions from three different algorithms.

# Simulate predictions from three models
set.seed(456)

# Model 1: Strong model
pred_model1 <- predicted_probs

# Model 2: Moderate model (more noise)
pred_model2 <- ifelse(actual_labels == 1,
                      rbeta(n, 3, 3),
                      rbeta(n, 3, 4))

# Model 3: Weak model (lots of noise)
pred_model3 <- ifelse(actual_labels == 1,
                      rbeta(n, 2, 2),
                      rbeta(n, 2, 2.5))

# Create ROC objects for all models
roc1 <- roc(actual_labels, pred_model1)
roc2 <- roc(actual_labels, pred_model2)
roc3 <- roc(actual_labels, pred_model3)

# Plot all curves together
plot(roc1, col = "#1c61b6", lwd = 2, main = "Model Comparison")
plot(roc2, col = "#e74c3c", lwd = 2, add = TRUE)
plot(roc3, col = "#27ae60", lwd = 2, add = TRUE)

# Add legend
legend("bottomright", 
       legend = c(
         paste0("Model 1 (AUC = ", round(auc(roc1), 3), ")"),
         paste0("Model 2 (AUC = ", round(auc(roc2), 3), ")"),
         paste0("Model 3 (AUC = ", round(auc(roc3), 3), ")")
       ),
       col = c("#1c61b6", "#e74c3c", "#27ae60"),
       lwd = 2)

For a ggplot2 version with better control:

# Combine ROC data
roc_comparison <- rbind(
  data.frame(fpr = 1 - roc1$specificities, 
             tpr = roc1$sensitivities, 
             model = "Model 1"),
  data.frame(fpr = 1 - roc2$specificities, 
             tpr = roc2$sensitivities, 
             model = "Model 2"),
  data.frame(fpr = 1 - roc3$specificities, 
             tpr = roc3$sensitivities, 
             model = "Model 3")
)

# Plot comparison
ggplot(roc_comparison, aes(x = fpr, y = tpr, color = model)) +
  geom_line(linewidth = 1.2) +
  geom_abline(intercept = 0, slope = 1, 
              linetype = "dashed", color = "gray50") +
  scale_color_manual(
    values = c("#1c61b6", "#e74c3c", "#27ae60"),
    labels = c(
      paste0("Model 1 (AUC = ", round(auc(roc1), 3), ")"),
      paste0("Model 2 (AUC = ", round(auc(roc2), 3), ")"),
      paste0("Model 3 (AUC = ", round(auc(roc3), 3), ")")
    )
  ) +
  labs(
    title = "ROC Curve Comparison",
    x = "False Positive Rate",
    y = "True Positive Rate",
    color = "Model"
  ) +
  theme_minimal() +
  theme(legend.position = c(0.7, 0.3)) +
  coord_fixed()

This visualization immediately shows which model performs best. Model 1’s curve hugs the upper-left corner more tightly than the others, reflected in its higher AUC.

Interpreting Results and Best Practices

When reading ROC curves, focus on three things: the curve’s proximity to the upper-left corner, the AUC value, and the shape of the curve. A curve that rises steeply then flattens indicates good separation between classes at lower thresholds. A curve that rises gradually throughout suggests poor separation.

Choose your operating threshold based on business requirements, not just AUC. If false positives are expensive (like unnecessary medical procedures), operate on the left side of the curve where FPR is low. If false negatives are catastrophic (missing a cancer diagnosis), operate higher on the curve where sensitivity is maximized.
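pROC's coords() function automates this choice. The sketch below uses toy data as a stand-in for the roc_obj built earlier; x = "best" maximizes Youden's J (sensitivity + specificity - 1) by default, and best.weights lets you encode asymmetric costs:

```r
library(pROC)

# Toy stand-in for the roc_obj built earlier
set.seed(1)
actual <- rbinom(200, 1, 0.3)
probs  <- ifelse(actual == 1, rbeta(200, 4, 2), rbeta(200, 2, 4))
roc_obj <- roc(actual, probs, quiet = TRUE)

# Threshold maximizing Youden's J (sensitivity + specificity - 1)
coords(roc_obj, x = "best", best.method = "youden")

# Treat a false negative as twice as costly as a false positive,
# at an assumed prevalence of 0.3
coords(roc_obj, x = "best", best.method = "youden",
       best.weights = c(2, 0.3))
```

The second call shifts the chosen threshold toward higher sensitivity, exactly the left-versus-right trade described above.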

Be cautious with imbalanced datasets. When you have 95% negative cases and 5% positive cases, a model that predicts everything as negative gets 95% accuracy but is useless. ROC curves can look deceptively good on imbalanced data because specificity (true negative rate) is easy to achieve when negatives dominate. In these cases, precision-recall curves often provide better insight.
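As a quick sanity check on imbalanced data, precision and recall at a given threshold take only a few lines of base R (simulated imbalanced data, assumed for illustration). Notice how precision can be poor even when specificity looks respectable:

```r
set.seed(42)
n <- 1000
actual <- rbinom(n, 1, 0.05)  # only ~5% positives
probs  <- ifelse(actual == 1, rbeta(n, 4, 2), rbeta(n, 2, 4))

pred <- as.integer(probs >= 0.5)
tp <- sum(pred == 1 & actual == 1)
fp <- sum(pred == 1 & actual == 0)
fn <- sum(pred == 0 & actual == 1)
tn <- sum(pred == 0 & actual == 0)

c(precision   = tp / (tp + fp),  # of flagged cases, how many are real?
  recall      = tp / (tp + fn),  # of real positives, how many flagged?
  specificity = tn / (tn + fp))
```

Because negatives vastly outnumber positives, even a modest false positive rate floods the flagged set, which is exactly what precision exposes and specificity hides.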

Use statistical tests to compare models rigorously. The pROC package includes the roc.test() function for DeLong’s test, which tells you if the AUC difference between two models is statistically significant.

# Test if Model 1 is significantly better than Model 2
roc.test(roc1, roc2)

Always validate on held-out test data. An ROC curve computed on training data reflects how well the model memorized that data, not how it generalizes. Cross-validation with ROC curves at each fold gives you confidence intervals around your AUC estimate.
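A minimal k-fold sketch for per-fold AUC with pROC (simulated data and a plain logistic regression, both assumed for illustration; caret's createFolds() would also work for the fold assignment):

```r
library(pROC)

set.seed(99)
n  <- 300
df <- data.frame(x = rnorm(n))
df$y <- rbinom(n, 1, plogis(1.5 * df$x))  # outcome driven by x

k <- 5
folds <- sample(rep(1:k, length.out = n))

fold_aucs <- sapply(1:k, function(i) {
  # Fit on everything outside fold i, score the held-out fold
  fit <- glm(y ~ x, data = df[folds != i, ], family = binomial)
  preds <- predict(fit, newdata = df[folds == i, ], type = "response")
  as.numeric(auc(roc(df$y[folds == i], preds, quiet = TRUE)))
})

round(fold_aucs, 3)
mean(fold_aucs)  # point estimate; sd(fold_aucs) indicates the spread
```

The spread across folds is often as informative as the mean: a model whose AUC swings widely between folds is not one to trust at a fixed threshold.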

ROC curves are indispensable for binary classification, but they’re just one tool. Combine them with confusion matrices, calibration plots, and domain expertise to build models that actually solve problems.
