How to Add Annotations in ggplot2
Key Insights
- Use annotate() for single annotations and geom_text() for data-driven labels; mixing them up is the most common beginner mistake
- The ggrepel package solves overlapping label problems automatically and should be your default choice for labeling multiple points
- Annotations are layered in the order they’re added to your plot, so add background elements first and text last for proper visibility
Introduction to Plot Annotations
A chart without annotations is like a map without labels—technically complete but practically useless. Raw data visualizations force readers to hunt for insights. Good annotations direct attention to what matters, provide context, and transform your plot from a data dump into a communication tool.
The difference is stark. A scatter plot showing sales data becomes meaningful when you label the outlier representing your biggest customer win. A time series gains context when you mark the product launch date. A distribution makes sense when you highlight the median and quartiles.
Here’s a simple example showing the impact:
library(ggplot2)
# Sample data
sales_data <- data.frame(
month = 1:12,
revenue = c(45, 52, 48, 65, 70, 85, 90, 88, 95, 110, 105, 120)
)
# Basic plot
p1 <- ggplot(sales_data, aes(x = month, y = revenue)) +
geom_line() +
geom_point() +
theme_minimal()
# Annotated version
p2 <- p1 +
annotate("rect", xmin = 6, xmax = 7, ymin = -Inf, ymax = Inf,
alpha = 0.2, fill = "blue") +
annotate("text", x = 6.5, y = 120,
label = "New product\nlaunched", size = 3.5) +
geom_hline(yintercept = mean(sales_data$revenue),
linetype = "dashed", color = "red") +
annotate("text", x = 11, y = mean(sales_data$revenue) + 5,
label = "Average", color = "red", size = 3)
The annotated version tells a story. The highlighted region shows when something changed. The reference line provides context. This is what we’re building toward.
Text Annotations with annotate() and geom_text()
The fundamental choice for text annotations is between annotate() and geom_text(). Get this wrong and you’ll fight with your code.
Use annotate() when you’re adding one or a few manual annotations at specific coordinates. It’s explicit and doesn’t require a separate data frame.
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
annotate("text", x = 4, y = 30,
label = "Lightweight cars\nget better mileage",
hjust = 0, size = 4, color = "darkblue") +
theme_minimal()
The hjust parameter controls horizontal justification (0 = left, 0.5 = center, 1 = right). Use vjust for vertical alignment.
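To see how justification moves text relative to its anchor, here is a minimal sketch. The data frame just_demo and its columns are made up for illustration; each label is drawn at the same x position as its red anchor point, with only hjust changing.

```r
library(ggplot2)

# Hypothetical demo data: one anchor point per hjust value
just_demo <- data.frame(
  x = 1,
  y = c(3, 2, 1),
  hjust = c(0, 0.5, 1),
  label = c("hjust = 0", "hjust = 0.5", "hjust = 1")
)

p_just <- ggplot(just_demo, aes(x = x, y = y)) +
  geom_point(color = "red", size = 2) +
  # hjust is mapped per row; the constant negative vjust lifts every
  # label above its anchor point
  geom_text(aes(label = label, hjust = hjust), vjust = -1) +
  xlim(0, 2) +
  ylim(0.5, 3.5) +
  theme_minimal()

p_just
```

With hjust = 0 the text starts at the point, with 0.5 it is centered on it, and with 1 it ends at it.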
Use geom_text() when your labels come from your data. This is for labeling points based on a column in your dataset.
# Label cars with exceptional MPG
exceptional_cars <- mtcars[mtcars$mpg > 30, ]
exceptional_cars$car_name <- rownames(exceptional_cars)
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point(alpha = 0.5) +
geom_point(data = exceptional_cars, color = "red", size = 3) +
geom_text(data = exceptional_cars,
aes(label = car_name),
hjust = -0.1, size = 3) +
theme_minimal()
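If you do label every row instead of a filtered subset, geom_text() has a check_overlap argument that skips labels that would be drawn over an earlier label. The sketch below assumes you are happy to lose some labels; the cars_labeled data frame is introduced here for illustration only.

```r
library(ggplot2)

# Copy mtcars and expose the row names as a regular column
cars_labeled <- mtcars
cars_labeled$car_name <- rownames(mtcars)

p_overlap <- ggplot(cars_labeled, aes(x = wt, y = mpg)) +
  geom_point(alpha = 0.5) +
  # check_overlap = TRUE silently drops labels that would overlap
  # a label drawn earlier in the same layer
  geom_text(aes(label = car_name),
            check_overlap = TRUE, vjust = -0.8, size = 3) +
  theme_minimal()

p_overlap
```

Which labels survive depends on row order and plot size, so this suits quick exploration more than a finished figure.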
For styling, all standard text aesthetics work:
annotate("text", x = 3, y = 25,
label = "Important note",
size = 5, # Text size
color = "red", # Text color
fontface = "bold", # bold, italic, bold.italic
family = "serif", # Font family
angle = 45) # Rotation angle
Geometric Annotations (Shapes and Lines)
Reference lines are the simplest and most useful geometric annotations. Add them with geom_vline(), geom_hline(), and geom_abline().
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
# linewidth replaces size for line geoms in ggplot2 3.4.0 and later
geom_hline(yintercept = mean(mtcars$mpg),
linetype = "dashed", color = "red", linewidth = 1) +
geom_vline(xintercept = median(mtcars$wt),
linetype = "dotted", color = "blue", linewidth = 1) +
theme_minimal()
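The example above covers horizontal and vertical lines. For geom_abline(), one plausible use is drawing a fitted trend line. This sketch assumes a simple linear model of mpg on wt; the model is an illustration, not something the plot requires.

```r
library(ggplot2)

# Fit a simple linear model to get an intercept and slope
fit <- lm(mpg ~ wt, data = mtcars)

p_abline <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # geom_abline() draws the line y = intercept + slope * x
  geom_abline(intercept = coef(fit)[["(Intercept)"]],
              slope = coef(fit)[["wt"]],
              linetype = "dashed", color = "darkgreen") +
  theme_minimal()

p_abline
```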
To highlight regions, use rectangles:
ggplot(mtcars, aes(x = wt, y = mpg)) +
annotate("rect",
xmin = 3, xmax = 4,
ymin = 15, ymax = 25,
alpha = 0.2, fill = "yellow") +
geom_point() +
annotate("text", x = 3.5, y = 26,
label = "Target zone") +
theme_minimal()
Set xmin = -Inf or ymax = Inf to extend rectangles to plot edges.
Arrows and segments draw attention to specific features:
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
annotate("segment",
x = 5, xend = 5.42,
y = 25, yend = 11, # End just above the heaviest car (wt = 5.424, mpg = 10.4)
arrow = arrow(length = unit(0.3, "cm")),
color = "red", linewidth = 1) +
annotate("text", x = 5, y = 26,
label = "Heaviest car", hjust = 0.5) +
theme_minimal()
For curved arrows, use annotate("curve") with the same parameters plus curvature:
annotate("curve",
x = 2, xend = 2.5,
y = 30, yend = 33,
arrow = arrow(length = unit(0.2, "cm")),
curvature = 0.3,
color = "darkgreen")
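As a complete version of the curved-arrow fragment, here is a sketch that points at the car with the highest mpg. The start coordinates and label position are arbitrary choices for this dataset.

```r
library(ggplot2)

# Row of the most fuel-efficient car; its row name is the car model
best <- mtcars[which.max(mtcars$mpg), ]

p_curve <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Curve from the label position to just right of the target point
  annotate("curve",
           x = 3, xend = best$wt + 0.05,
           y = 32, yend = best$mpg,
           arrow = arrow(length = unit(0.2, "cm")),
           curvature = 0.3,
           color = "darkgreen") +
  annotate("text", x = 3.05, y = 32,
           label = paste("Most efficient:", rownames(best)),
           hjust = 0, size = 3.5) +
  theme_minimal()

p_curve
```

Flipping the sign of curvature bends the arrow the other way, which helps when the curve would otherwise cross other points.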
Advanced Labeling with ggrepel
The ggrepel package is non-negotiable for labeling multiple points. It automatically adjusts label positions to avoid overlaps.
library(ggrepel)
# Identify interesting cars
interesting_cars <- mtcars[mtcars$mpg > 25 | mtcars$hp > 200, ]
interesting_cars$name <- rownames(interesting_cars)
ggplot(mtcars, aes(x = hp, y = mpg)) +
geom_point(alpha = 0.3) +
geom_point(data = interesting_cars, color = "red", size = 2) +
geom_text_repel(
data = interesting_cars,
aes(label = name),
size = 3,
max.overlaps = Inf,
box.padding = 0.5,
point.padding = 0.3
) +
theme_minimal()
For labels with background boxes, use geom_label_repel():
ggplot(mtcars, aes(x = hp, y = mpg)) +
geom_point(alpha = 0.3) +
geom_label_repel(
data = interesting_cars,
aes(label = name),
size = 3,
fill = "lightyellow",
segment.color = "gray50",
segment.size = 0.5,
min.segment.length = 0 # Always show connector lines
) +
theme_minimal()
Control the repulsion force and behavior:
geom_text_repel(
data = interesting_cars,
aes(label = name),
force = 10, # Repulsion strength
force_pull = 2, # Attraction to data point
max.iter = 10000, # Maximum iterations
direction = "y", # Only repel vertically
nudge_x = 0.5, # Shift all labels right
seed = 42 # Reproducible positioning
)
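The examples above pass only the highlighted subset to ggrepel, so labels can still land on top of unlabeled points. An alternative described in the ggrepel documentation is to pass the full data and set unwanted labels to an empty string, so that every point repels labels. A minimal sketch, where cars_labeled and highlight are names made up for illustration:

```r
library(ggplot2)
library(ggrepel)

cars_labeled <- mtcars
cars_labeled$name <- rownames(mtcars)

# Same selection rule as above; all other rows get an empty label
highlight <- cars_labeled$mpg > 25 | cars_labeled$hp > 200
cars_labeled$name[!highlight] <- ""

p_repel <- ggplot(cars_labeled, aes(x = hp, y = mpg)) +
  geom_point(color = ifelse(highlight, "red", "grey60")) +
  # Empty labels draw no text, but their points still push labels away
  geom_text_repel(aes(label = name),
                  size = 3, max.overlaps = Inf, seed = 42) +
  theme_minimal()

p_repel
```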
Mathematical Expressions and Special Characters
For mathematical notation, write the label as a string in plotmath syntax and set parse = TRUE:
ggplot(data.frame(x = 1:10, y = (1:10)^2), aes(x, y)) +
geom_line() +
annotate("text", x = 5, y = 80,
label = "y == x^2",
parse = TRUE, size = 6) +
theme_minimal()
The parse = TRUE argument tells ggplot2 to interpret the string as a plotmath expression, so == renders as an equals sign and names such as alpha render as Greek letters. Common patterns:
# Greek letters
annotate("text", x = 5, y = 50,
label = "alpha == 0.05", parse = TRUE)
# Fractions
annotate("text", x = 5, y = 40,
label = "frac(x, y)", parse = TRUE)
# Subscripts and superscripts
annotate("text", x = 5, y = 30,
label = "R^2 == 0.95", parse = TRUE)
Combine regular text with expressions using paste():
annotate("text", x = 5, y = 60,
label = "paste('Correlation: ', rho, ' = 0.8')",
parse = TRUE)
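In practice the number in a label usually comes from the data rather than being typed by hand. One way to do this, sketched here under the assumption of a simple linear fit, is to build the plotmath string with sprintf() before passing it to annotate().

```r
library(ggplot2)

# Compute R-squared from a simple linear model (illustrative choice)
fit <- lm(mpg ~ wt, data = mtcars)
r2 <- summary(fit)$r.squared

# Build the plotmath string; == becomes an equals sign when parsed
r2_label <- sprintf("R^2 == %.2f", r2)

p_math <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  annotate("text", x = 4.5, y = 30,
           label = r2_label, parse = TRUE, size = 5) +
  theme_minimal()

p_math
```

Because the string is parsed as an expression, a value such as 0.80 is displayed as 0.8. Wrapping the number in quotes inside the string keeps the trailing zero.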
Practical Tips and Best Practices
Layer order matters. Annotations are drawn in the order you add them. Add background elements first:
ggplot(mtcars, aes(x = wt, y = mpg)) +
annotate("rect", xmin = 3, xmax = 4, ymin = -Inf, ymax = Inf,
alpha = 0.2, fill = "gray") + # Background first
geom_point() + # Data in middle
annotate("text", x = 3.5, y = 30, # Text on top
label = "Zone of interest") +
theme_minimal()
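If you are unsure what order a plot will draw in, you can inspect its layers. This sketch rebuilds the plot above and lists the geom class of each layer; it relies on the layers element of the plot object, an internal detail that may change between ggplot2 versions.

```r
library(ggplot2)

p_layers <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  annotate("rect", xmin = 3, xmax = 4, ymin = -Inf, ymax = Inf,
           alpha = 0.2, fill = "gray") +
  geom_point() +
  annotate("text", x = 3.5, y = 30, label = "Zone of interest")

# Each layer stores its geom; collect the class names in drawing order
layer_geoms <- vapply(p_layers$layers,
                      function(layer) class(layer$geom)[1],
                      character(1))
print(layer_geoms)
```

The first entry is drawn first and ends up at the bottom, so the rectangle should appear before the points and the text.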
Understand coordinate systems. By default, annotations use data coordinates. If you zoom with coord_cartesian(), annotations stay with your data:
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
annotate("text", x = 3, y = 30, label = "Fixed position") +
theme_minimal()
# Annotation stays at x=3, y=30 even when zoomed
p + coord_cartesian(xlim = c(2, 4), ylim = c(15, 35))
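Sometimes you want the opposite behavior: a note pinned to a corner of the panel regardless of the axis range. A common approach on continuous axes is to use Inf as the coordinate and pull the text inward with hjust and vjust. The justification values below are rough guesses and need adjusting to taste.

```r
library(ggplot2)

p_corner <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Inf places the anchor on the panel edge; hjust and vjust above 1
  # nudge the text back inside the panel
  annotate("text", x = Inf, y = Inf,
           label = "Top-right note",
           hjust = 1.1, vjust = 1.5) +
  theme_minimal()

# The note stays in the top-right corner after zooming
p_corner + coord_cartesian(xlim = c(2, 4), ylim = c(15, 35))
```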
Keep it readable. Annotations should clarify, not clutter:
- Limit text annotations to 3-5 per plot
- Use color sparingly—too many colors create confusion
- Ensure sufficient contrast between text and background
- Test your plot at the size it will be displayed
Complete example combining multiple techniques:
library(ggplot2)
library(ggrepel)
# Create sample data
set.seed(42)
quarterly_data <- data.frame(
quarter = rep(paste0("Q", 1:4), 3),
year = rep(2021:2023, each = 4),
revenue = c(100, 105, 110, 115, 120, 125, 135, 140, 145, 155, 165, 180),
target = 150
)
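quarterly_data$period <- paste(quarterly_data$year, quarterly_data$quarter)
# Numeric x position stored as a column, so that layers drawn from a
# subset of the data (such as key_points) can inherit the x aesthetic
quarterly_data$index <- seq_len(nrow(quarterly_data))
# Identify key points
key_points <- quarterly_data[quarterly_data$revenue %in% c(100, 180), ]
ggplot(quarterly_data, aes(x = index, y = revenue)) +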
# Background highlight for target achievement
# Revenue crosses the 150 target halfway between periods 9 and 10
annotate("rect", xmin = 9.5, xmax = 12, ymin = -Inf, ymax = Inf,
alpha = 0.1, fill = "green") +
# Target line
geom_hline(yintercept = 150, linetype = "dashed",
color = "red", linewidth = 0.8) +
# Main data (linewidth replaces size for lines in ggplot2 3.4.0 and later)
geom_line(linewidth = 1.2, color = "steelblue") +
geom_point(size = 3, color = "steelblue") +
# Target label
annotate("text", x = 11, y = 155,
label = "Revenue Target", color = "red", size = 3.5) +
# Achievement zone label
annotate("text", x = 10.5, y = 95,
label = "Target\nAchieved", color = "darkgreen",
fontface = "bold", size = 4) +
# Label key points
geom_label_repel(data = key_points,
aes(label = paste0("$", revenue, "M")),
size = 3, fill = "lightyellow") +
labs(title = "Quarterly Revenue Growth",
x = "Quarter", y = "Revenue ($M)") +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 14))
This example shows proper layering, combines multiple annotation types, and maintains readability. Your annotations should always serve the story you’re telling with your data.