How to Create a Line Chart in ggplot2
Key Insights
- Line charts in ggplot2 require data in long format with one row per observation, making data preparation as important as the visualization code itself
- The geom_line() layer automatically connects points in x-axis order, but you must explicitly specify grouping when plotting multiple series to avoid incorrect connections
- Combining geom_line() with geom_point(), custom themes, and strategic use of color creates publication-ready visualizations that communicate trends effectively
Introduction to ggplot2 Line Charts
Line charts excel at showing trends over continuous variables, particularly time. In ggplot2, line charts follow the grammar of graphics, a systematic approach where you build visualizations by layering geometric objects, aesthetic mappings, and scales.
Use line charts when you need to display changes in continuous data, compare trends across multiple groups, or highlight patterns in time series data. They’re particularly effective for showing stock prices, temperature changes, sales trends, or any metric that evolves continuously.
Here’s the basic setup:
library(ggplot2)
library(dplyr)
# Preview the built-in economics dataset
head(economics)
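The economics dataset ships with ggplot2 and contains monthly US economic time series, which makes it well suited to demonstrating line chart techniques. It includes the number of unemployed (unemploy, in thousands), the total population (pop, in thousands), and the personal savings rate (psavert, in percent).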
Creating a Basic Line Chart
The fundamental ggplot2 syntax combines ggplot() to initialize the plot with geom_line() to draw the line. You map variables to aesthetics using aes():
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()
This creates a simple line chart showing unemployment over time. The structure is straightforward: ggplot() establishes the data and default aesthetic mappings, while geom_line() adds the geometric layer that renders the line.
Data format matters critically here. ggplot2 expects one row per observation. If your data is in wide format (multiple columns for different time points), reshape it to long format using tidyr::pivot_longer() before plotting.
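As a minimal sketch of that reshaping step, assume a small wide table with one column per year. The table name sales_wide and its values are made up for illustration, and names_transform assumes a reasonably recent tidyr (1.1.0 or later):

```r
library(tidyr)

# Hypothetical wide table: one row per product, one column per year
sales_wide <- data.frame(
  product = c("A", "B"),
  `2021` = c(10, 7),
  `2022` = c(12, 9),
  `2023` = c(15, 8),
  check.names = FALSE
)

# Reshape to long format: one row per product-year observation
sales_long <- pivot_longer(
  sales_wide,
  cols = -product,
  names_to = "year",
  values_to = "sales",
  names_transform = list(year = as.integer)  # column names arrive as text
)

sales_long
```

The long table has one row per product and year, so year can go straight onto the x-axis and product onto color.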
For a more complete basic example:
ggplot(economics, aes(x = date, y = psavert)) +
  geom_line() +
  labs(
    x = "Year",
    y = "Personal Savings Rate (%)"
  )
This chart tracks the personal savings rate, which peaks in the mid-1970s and then declines to a low in the mid-2000s.
Customizing Line Appearance
Control line aesthetics through parameters in geom_line(). The most common customizations involve color, width, and line type:
# Colored line with custom width
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(color = "#2C3E50", linewidth = 1.2)
# Dashed line
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(linetype = "dashed", color = "steelblue", linewidth = 0.8)
# Line with points (points still take size, not linewidth)
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(color = "darkred", linewidth = 0.8) +
  geom_point(color = "darkred", size = 1.5)
The linetype parameter accepts values like "solid", "dashed", "dotted", "dotdash", and "longdash". Line thickness is controlled by linewidth in ggplot2 3.4.0 and later; the older size argument still works for lines but triggers a deprecation warning. For points, size remains the correct argument.
For subtle variations, adjust transparency with alpha:
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(color = "navy", linewidth = 1, alpha = 0.7)
Alpha values range from 0 (fully transparent) to 1 (fully opaque). This proves useful when overlaying multiple elements or creating layered visualizations.
Multiple Lines on One Chart
Comparing multiple series requires proper grouping. Map the grouping variable to the color aesthetic, and ggplot2 handles the rest:
# Reshape two variables into long format: one row per date and metric
econ_long <- economics %>%
  select(date, unemploy, pop) %>%
  tidyr::pivot_longer(
    cols = c(unemploy, pop),
    names_to = "metric",
    values_to = "value"
  )
# This will create incorrect connections: both series are drawn as one zigzag line
ggplot(econ_long, aes(x = date, y = value)) +
  geom_line()
# Correct approach with grouping: one line per metric
ggplot(econ_long, aes(x = date, y = value, color = metric)) +
  geom_line(linewidth = 1)
Without the color or group aesthetic, ggplot2 doesn’t know where one series ends and another begins, creating nonsensical connections between unrelated points.
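When the lines should share one color, the group aesthetic on its own is enough to separate the series. A short sketch, using a small made-up data frame (df and its values are illustrative assumptions, not part of the economics data):

```r
library(ggplot2)

# Illustrative data: two series measured at the same three x values
df <- data.frame(
  x = rep(1:3, times = 2),
  y = c(1, 2, 3, 3, 2, 1),
  series = rep(c("a", "b"), each = 3)
)

# group separates the series without giving them different colors
p <- ggplot(df, aes(x = x, y = y, group = series)) +
  geom_line(color = "gray40")

p
```

This is handy for "spaghetti" backgrounds, where many series are drawn in a muted color and only one or two are highlighted.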
For a more practical example comparing normalized trends:
# Normalize to percentage of maximum for fair comparison
econ_normalized <- economics %>%
  select(date, unemploy, psavert, uempmed) %>%
  mutate(
    unemploy = unemploy / max(unemploy) * 100,
    psavert = psavert / max(psavert) * 100,
    uempmed = uempmed / max(uempmed) * 100
  ) %>%
  tidyr::pivot_longer(
    cols = -date,
    names_to = "indicator",
    values_to = "pct_of_max"
  )
ggplot(econ_normalized, aes(x = date, y = pct_of_max, color = indicator)) +
  geom_line(linewidth = 1) +
  labs(
    x = "Year",
    y = "Percent of Maximum Value",
    color = "Economic Indicator"
  )
This normalization technique allows meaningful comparison of variables with different scales.
Enhancing with Labels and Themes
Professional visualizations require clear labels and thoughtful styling:
# unemploy is recorded in thousands, so dividing by 1000 gives millions
ggplot(economics, aes(x = date, y = unemploy / 1000)) +
  geom_line(color = "#E74C3C", linewidth = 1.2) +
  labs(
    title = "US Unemployment Over Time",
    subtitle = "Total unemployed in millions",
    x = NULL,
    y = "Unemployed (millions)",
    caption = "Source: economics dataset"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16),
    plot.subtitle = element_text(color = "gray40"),
    axis.text = element_text(size = 10)
  )
Built-in themes provide instant polish: theme_minimal(), theme_bw(), theme_classic(), and theme_light() each offer distinct aesthetics. I prefer theme_minimal() for its clean, modern look.
Control axis formatting with scale functions:
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(color = "steelblue", linewidth = 1) +
  scale_y_continuous(
    labels = scales::comma,
    breaks = seq(0, 16000, 2000),
    limits = c(0, 16000)
  ) +
  scale_x_date(
    date_breaks = "5 years",
    date_labels = "%Y"
  ) +
  theme_minimal()
The scales package provides helpful formatting functions like comma(), dollar(), and percent().
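For instance, here is a sketch of percent-formatted axis labels for the savings rate. It assumes a scales version that provides label_percent() (1.1.0 or later); the scale = 1 argument is needed because psavert is already stored in percent units:

```r
library(ggplot2)

# psavert holds values like 12.6 (already percent), so scale = 1
# appends "%" without multiplying by 100
p <- ggplot(economics, aes(x = date, y = psavert)) +
  geom_line() +
  scale_y_continuous(labels = scales::label_percent(scale = 1))

p
```

With the default scale = 100, the same call would suit a variable stored as a proportion between 0 and 1.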
Common Use Cases and Advanced Techniques
Real-world line charts often require annotations, reference lines, or faceting. Here’s a time series with highlighted regions:
# Great Recession window: December 2007 to June 2009
recession_start <- as.Date("2007-12-01")
recession_end <- as.Date("2009-06-01")
ggplot(economics, aes(x = date, y = unemploy)) +
  # annotate() draws a single rectangle; geom_rect() on filtered data would
  # stack one rectangle per row and make the shading nearly opaque
  annotate(
    "rect", xmin = recession_start, xmax = recession_end,
    ymin = -Inf, ymax = Inf,
    fill = "gray80", alpha = 0.5
  ) +
  geom_line(color = "darkred", linewidth = 1) +
  annotate(
    "text", x = as.Date("2008-06-01"), y = 14000,
    label = "Great Recession", fontface = "bold"
  ) +
  labs(
    title = "Unemployment Spike During Recession",
    x = NULL,
    y = "Unemployed (thousands)"
  ) +
  theme_minimal()
For comparing across categories, use faceting:
# Create sample data with multiple categories
set.seed(42)  # make the random sample data reproducible
sample_data <- data.frame(
  date = rep(seq(as.Date("2020-01-01"), as.Date("2023-12-31"), by = "month"), 3),
  value = c(cumsum(rnorm(48, 5, 2)), cumsum(rnorm(48, 3, 1.5)), cumsum(rnorm(48, 7, 3))),
  category = rep(c("Product A", "Product B", "Product C"), each = 48)
)
ggplot(sample_data, aes(x = date, y = value)) +
  geom_line(color = "steelblue", linewidth = 1) +
  facet_wrap(~category, ncol = 1, scales = "free_y") +
  theme_minimal() +
  labs(title = "Sales Trends by Product", x = NULL, y = "Sales")
Adding trend lines reveals underlying patterns:
ggplot(economics, aes(x = date, y = psavert)) +
  geom_line(color = "gray60", linewidth = 0.8) +
  geom_smooth(method = "loess", color = "darkblue", linewidth = 1.2, se = TRUE) +
  labs(
    title = "Personal Savings Rate with Trend Line",
    x = NULL,
    y = "Savings Rate (%)"
  ) +
  theme_minimal()
Best Practices and Conclusion
Effective line charts follow these principles:
Keep it simple. Don’t plot more than 5-7 lines on a single chart. Beyond that, consider faceting or interactive visualizations.
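Order matters. geom_line() connects points in order of the x variable, whatever the row order of your data, while geom_path() connects them in the order the rows appear. Reach for geom_path() only when row order is meaningful, and sort your data first in that case.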
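How the connection order is decided can be checked directly. A small sketch with deliberately shuffled, made-up data, comparing geom_line() with geom_path() and using layer_data() to inspect what each layer will draw:

```r
library(ggplot2)

# Made-up data with rows deliberately out of x order
shuffled <- data.frame(x = c(3, 1, 2), y = c(9, 1, 4))

# geom_line() sorts by x before connecting: (1,1) -> (2,4) -> (3,9)
p_line <- ggplot(shuffled, aes(x, y)) + geom_line()

# geom_path() connects rows in data order: (3,9) -> (1,1) -> (2,4)
p_path <- ggplot(shuffled, aes(x, y)) + geom_path()

# Inspect the order in which each layer will draw its points
layer_data(p_line)$x
layer_data(p_path)$x
```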
Choose colors deliberately. Use color to convey meaning, not just decoration. The RColorBrewer and viridis packages provide colorblind-friendly palettes.
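Both palette families are reachable through scale functions built into ggplot2, so a separate library() call is usually unnecessary. A sketch with made-up data (df is illustrative, and linewidth assumes ggplot2 3.4.0 or later):

```r
library(ggplot2)

# Made-up data: three series over five time points
df <- data.frame(
  x = rep(1:5, times = 3),
  y = c(1:5, (1:5) * 2, (1:5) * 3),
  series = rep(c("a", "b", "c"), each = 5)
)

base <- ggplot(df, aes(x = x, y = y, color = series)) +
  geom_line(linewidth = 1)

# Discrete viridis scale; end < 1 trims the pale yellow end of the palette
p_viridis <- base + scale_color_viridis_d(end = 0.9)

# ColorBrewer qualitative palette
p_brewer <- base + scale_color_brewer(palette = "Dark2")

p_viridis
```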
Label directly when possible. For charts with few lines, consider direct labels using geom_text() instead of legends, which require eye movement back and forth.
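One way to sketch direct labels is to give geom_text() only the last observation of each series. The data frame and product names below are made up for illustration, and linewidth assumes ggplot2 3.4.0 or later:

```r
library(ggplot2)

# Made-up data: two series over five time points
df <- data.frame(
  x = rep(1:5, times = 2),
  y = c(1, 2, 4, 7, 11, 2, 3, 3, 4, 6),
  series = rep(c("Product A", "Product B"), each = 5)
)

# Keep only the last observation of each series as the label position
label_pos <- df[df$x == max(df$x), ]

p <- ggplot(df, aes(x = x, y = y, color = series)) +
  geom_line(linewidth = 1) +
  geom_text(
    data = label_pos,
    aes(label = series),
    hjust = 0, nudge_x = 0.1  # left-align the label just past the line end
  ) +
  # Leave room on the right for the labels and drop the redundant legend
  scale_x_continuous(expand = expansion(mult = c(0.05, 0.3))) +
  theme(legend.position = "none")

p
```

If the line ends sit close together the labels may overlap; nudging them by hand or using a label-repelling package are common workarounds.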
Mind your aspect ratio. The ratio of chart width to height affects how steep a trend appears. Banking to 45 degrees, that is, choosing an aspect ratio at which the line segments have an average absolute slope of about 45 degrees, often works well.
The key functions to remember:
- ggplot() + aes(): Initialize plot and map aesthetics
- geom_line(): Draw the line layer
- scale_*(): Control axis formatting and limits
- theme_*(): Apply overall styling
- labs(): Add labels and titles
- facet_wrap() or facet_grid(): Create small multiples
Line charts in ggplot2 balance simplicity with power. Master these fundamentals, and you’ll create clear, compelling visualizations that communicate your data’s story effectively.