R ggplot2 - Complete Tutorial with Examples

Key Insights

  • ggplot2 implements the Grammar of Graphics, building plots through layered components (data, aesthetics, geometries, scales, and themes) rather than predefined chart types
  • The aes() function maps data variables to visual properties, while geom functions determine how those mappings are rendered as points, lines, bars, or other shapes
  • Faceting, statistical transformations, and coordinate systems provide advanced capabilities for multi-panel displays, data summaries, and specialized plot types

Installation and Basic Structure

Install ggplot2 from CRAN or load it as part of the tidyverse:

install.packages("ggplot2")
library(ggplot2)

# Or load the entire tidyverse
library(tidyverse)

Every ggplot2 visualization follows this structure:

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

The ggplot() function initializes the plot with a dataset. You then add layers using the + operator. Here’s a minimal example using the built-in mtcars dataset:

ggplot(data = mtcars) +
  geom_point(mapping = aes(x = wt, y = mpg))

This creates a scatter plot with car weight on the x-axis and miles per gallon on the y-axis.
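
A plot is an ordinary R object, so it can be stored in a variable and extended one layer at a time. The sketch below assumes only ggplot2 and the built-in mtcars data; the variable name p is illustrative.

```r
library(ggplot2)

# Start with an empty plot that only knows about the data
p <- ggplot(data = mtcars)

# Add a point layer; `+` returns a new plot object
p <- p + geom_point(mapping = aes(x = wt, y = mpg))

# Nothing is drawn until the object is printed
print(p)
```

Building plots this way makes it easy to reuse a common base and add different layers to it.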

Aesthetic Mappings

Aesthetics map data variables to visual properties. Common aesthetics include x, y, color, fill, size, alpha, shape, and linetype. As of ggplot2 3.4.0, line thickness is controlled by linewidth rather than size.

# Color points by number of cylinders
ggplot(mtcars) +
  geom_point(aes(x = wt, y = mpg, color = factor(cyl)))

# Size points by horsepower, shape by transmission type
ggplot(mtcars) +
  geom_point(aes(x = wt, y = mpg, size = hp, shape = factor(am)))

# Set fixed aesthetic (not mapped to data)
ggplot(mtcars) +
  geom_point(aes(x = wt, y = mpg), color = "blue", size = 3)

Aesthetics set inside aes() are mapped to data variables and get a scale (and usually a legend). Arguments set outside aes() are fixed values applied to every element of that layer.
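
A common mistake is to put a constant inside aes(). The following sketch, which assumes the default ggplot2 color scale, compares the two cases; the names p_fixed and p_mapped are illustrative.

```r
library(ggplot2)

# Fixed: every point is drawn in blue and no legend is created
p_fixed <- ggplot(mtcars) +
  geom_point(aes(x = wt, y = mpg), color = "blue")

# Mapped by mistake: "blue" is treated as a one-level data value,
# so ggplot2 picks a colour from its default palette and adds a legend
p_mapped <- ggplot(mtcars) +
  geom_point(aes(x = wt, y = mpg, color = "blue"))

# layer_data() shows the colours that will actually be drawn
unique(layer_data(p_fixed)$colour)
unique(layer_data(p_mapped)$colour)
```

If a plot shows an unexpected color and a legend with a single entry, a constant inside aes() is a likely cause.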

Common Geometries

ggplot2 provides over 40 geom functions. Here are the most frequently used:

# Scatter plot
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Line plot
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()

# Bar plot (count)
ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

# Bar plot (specified values)
df <- data.frame(category = c("A", "B", "C"), value = c(23, 45, 12))
ggplot(df, aes(x = category, y = value)) +
  geom_col()

# Histogram
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2)

# Box plot
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

# Scatter plot with a fitted linear regression line and confidence band
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm")
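
The difference between the two bar geoms can be checked directly. As a minimal sketch, the code below reuses the small df data frame from above and compares geom_col() with geom_bar(stat = "identity"); the names p_col and p_bar are illustrative.

```r
library(ggplot2)

df <- data.frame(category = c("A", "B", "C"), value = c(23, 45, 12))

# geom_col() takes bar heights straight from the y values
p_col <- ggplot(df, aes(x = category, y = value)) +
  geom_col()

# geom_bar() counts rows by default; stat = "identity" turns that off
p_bar <- ggplot(df, aes(x = category, y = value)) +
  geom_bar(stat = "identity")

# Both layers end up with the same bar heights
all.equal(layer_data(p_col)$y, layer_data(p_bar)$y)
```

In practice geom_col() is the shorter way to write this when the heights are already in the data.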

Layering Multiple Geoms

Combine multiple geoms to create complex visualizations:

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl)), size = 3) +
  geom_smooth(method = "lm", se = TRUE, color = "black") +
  geom_hline(yintercept = mean(mtcars$mpg), linetype = "dashed", color = "red")

Each geom can have its own data and aesthetic mappings:

# Calculate mean mpg by cylinder
mean_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_jitter(width = 0.2, alpha = 0.5) +
  geom_point(data = mean_mpg, aes(x = factor(cyl), y = mpg), 
             color = "red", size = 4, shape = 18)
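
A layer with its own data is also a convenient way to highlight a subset. In this sketch the 30 mpg cutoff and the name efficient are arbitrary choices for illustration; the second layer inherits x and y from ggplot() and only swaps the data.

```r
library(ggplot2)

# Hypothetical subset to highlight: cars above 30 mpg
efficient <- subset(mtcars, mpg > 30)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # All cars in grey
  geom_point(color = "grey60") +
  # Only the subset, drawn on top in red
  geom_point(data = efficient, color = "red", size = 3)

p
```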

Faceting

Facets split data into subplots based on categorical variables:

# Single variable faceting
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl)

# Two variable faceting
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_grid(am ~ cyl)

# Control scales
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl, scales = "free_y")

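The scales argument accepts "free_x", "free_y", or "free". The example above uses "free_y", so each panel gets its own y-axis range while the x-axis stays shared. Use scales = "free" to let both axes vary per facet.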
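
Panel layout and strip labels can be adjusted in the same call. The sketch below assumes a single-column layout is wanted and uses label_both so each strip shows the variable name as well as its value.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(
    ~ cyl,
    ncol = 1,               # stack the panels in one column
    scales = "free",        # independent x and y ranges per panel
    labeller = label_both   # strips read "cyl: 4" instead of "4"
  )

p
```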

Statistical Transformations

Many geoms perform statistical transformations on data before plotting:

# geom_bar counts observations
ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

# Equivalent using stat_count
ggplot(mtcars, aes(x = factor(cyl))) +
  stat_count(geom = "bar")

# Display proportions instead of counts
ggplot(mtcars, aes(x = factor(cyl), y = after_stat(prop), group = 1)) +
  geom_bar()

# Add statistical summary
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_jitter(width = 0.2, alpha = 0.3) +
  stat_summary(fun = mean, geom = "point", color = "red", size = 4) +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2, color = "red")
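
To see what a stat computes, compare its output with a manual calculation. This sketch checks that stat_summary(fun = mean) produces the same per-cylinder means as tapply(); the names computed and by_hand are illustrative.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(fun = mean, geom = "point")

# Values the stat computed, ordered by x position (4, 6, 8 cylinders)
computed <- layer_data(p)
computed <- computed[order(computed$x), ]

# The same means calculated by hand
by_hand <- tapply(mtcars$mpg, mtcars$cyl, mean)

all.equal(computed$y, as.numeric(by_hand))
```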

Scales

Scales control how data values map to visual properties:

# Continuous color scale
ggplot(mtcars, aes(x = wt, y = mpg, color = hp)) +
  geom_point(size = 3) +
  scale_color_gradient(low = "blue", high = "red")

# Discrete color scale
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  scale_color_manual(values = c("4" = "#E41A1C", "6" = "#377EB8", "8" = "#4DAF4A"))

# Logarithmic scale
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  scale_y_log10()

# Custom axis limits and breaks
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  scale_x_continuous(limits = c(1, 6), breaks = seq(1, 6, 1)) +
  scale_y_continuous(limits = c(10, 35), breaks = seq(10, 35, 5))
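
Scale limits do more than zoom: values outside the limits are replaced with NA before any statistics are computed. To change only the visible window, set limits on the coordinate system instead. The following sketch contrasts the two; the 2 to 4 range is an arbitrary choice for illustration.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Scale limits: weights outside 2-4 become NA and are dropped when drawn
p_limits <- base + scale_x_continuous(limits = c(2, 4))

# Coordinate limits: every row is kept, only the viewport changes
p_zoom <- base + coord_cartesian(xlim = c(2, 4))

# Check whether any x values were censored in each version
anyNA(layer_data(p_limits)$x)
anyNA(layer_data(p_zoom)$x)
```

The distinction matters most with layers such as geom_smooth() or geom_boxplot(), where dropped rows change the fitted result.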

Labels and Annotations

Add titles, labels, and text annotations:

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl))) +
  labs(
    title = "Fuel Efficiency vs. Weight",
    subtitle = "Data from 1974 Motor Trend",
    x = "Weight (1000 lbs)",
    y = "Miles per Gallon",
    color = "Cylinders",
    caption = "Source: mtcars dataset"
  )

# Add text annotations
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  annotate("text", x = 4, y = 30, label = "Lightweight cars", size = 5) +
  annotate("rect", xmin = 1.5, xmax = 3, ymin = 25, ymax = 35, 
           alpha = 0.2, fill = "blue")
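
annotate() places a fixed element at coordinates you choose. When labels should come from the data, geom_text() is the usual tool. A minimal sketch, assuming the car names are taken from the row names of mtcars and stored in a hypothetical model column:

```r
library(ggplot2)

# Copy the row names into a regular column so they can be mapped
cars <- mtcars
cars$model <- rownames(mtcars)

p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  # One label per row; check_overlap skips labels that would collide
  geom_text(aes(label = model), vjust = -0.8, size = 3, check_overlap = TRUE)

p
```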

Themes

Themes control non-data plot elements:

# Built-in themes
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme_minimal()

# Other themes: theme_bw(), theme_classic(), theme_dark(), theme_void()

# Customize theme elements
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    axis.text = element_text(size = 12),
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )
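
Theme settings can be stored in an object and reused across plots, or installed as the session default with theme_set(). The name report_theme below is illustrative.

```r
library(ggplot2)

# A reusable theme: a built-in theme plus a few overrides
report_theme <- theme_minimal(base_size = 12) +
  theme(
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )

# Apply it to a single plot
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  report_theme

# Or make it the default for later plots; theme_set() returns the old theme
old_theme <- theme_set(report_theme)

# Restore the previous default when done
theme_set(old_theme)
```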

Coordinate Systems

Transform the coordinate system for specialized plots:

# Flip coordinates
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  coord_flip()

# Polar coordinates for pie chart
df <- data.frame(category = c("A", "B", "C"), value = c(30, 50, 20))
ggplot(df, aes(x = "", y = value, fill = category)) +
  geom_col() +
  coord_polar(theta = "y")

# Fixed aspect ratio
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  coord_fixed(ratio = 0.1)
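
For horizontal box plots, coord_flip() is often unnecessary. Assuming ggplot2 3.3.0 or later, where geoms such as geom_boxplot() detect their orientation from the mappings, the categorical variable can be mapped to y directly:

```r
library(ggplot2)

# Horizontal box plot: continuous variable on x, categories on y
p <- ggplot(mtcars, aes(x = mpg, y = factor(cyl))) +
  geom_boxplot()

p
```

This keeps axis labels and theme settings tied to the axes as they appear on screen.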

Saving Plots

Export plots with ggsave():

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Save the most recent plot (ggsave() defaults to last_plot())
ggsave("plot.png", width = 8, height = 6, dpi = 300)

# Save specific plot object
ggsave("plot.pdf", plot = p, width = 10, height = 8)

# The format is inferred from the file extension: png, pdf, jpeg, tiff, eps,
# and svg (svg output requires the svglite package)
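
Width and height are in inches by default, so the pixel size of a raster file is the size in inches multiplied by dpi. The sketch below writes to a temporary directory to avoid cluttering the working directory; the file name is illustrative.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Write to a temporary location
out_file <- file.path(tempdir(), "mtcars-scatter.png")

# 8 x 6 inches at 300 dpi gives a 2400 x 1800 pixel image
ggsave(out_file, plot = p, width = 8, height = 6, units = "in", dpi = 300)

# Confirm the file was written
file.exists(out_file)
```

Passing plot = p explicitly avoids saving the wrong plot when several have been created in a session.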

Practical Example: Multi-Panel Dashboard

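This example combines several of the techniques above and arranges the results with the patchwork package:

# install.packages("patchwork")  # run once if patchwork is not installed
library(ggplot2)
library(patchwork)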

p1 <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  geom_smooth(method = "lm", se = FALSE) +
  scale_color_brewer(palette = "Set1") +
  labs(title = "MPG vs Weight", color = "Cylinders") +
  theme_minimal()

p2 <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(fill = "steelblue", alpha = 0.7) +
  labs(title = "MPG by Cylinder Count", x = "Cylinders", y = "MPG") +
  theme_minimal()

p3 <- ggplot(mtcars, aes(x = hp)) +
  geom_histogram(binwidth = 20, fill = "coral", color = "black") +
  labs(title = "Horsepower Distribution", x = "Horsepower", y = "Count") +
  theme_minimal()

# Combine plots
(p1 | p2) / p3

This creates a three-panel dashboard: the scatter plot and box plot sit side by side in the top row, and the histogram spans the bottom row. In patchwork, the | operator places plots next to each other and / stacks them vertically.
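
patchwork can also merge duplicate legends and add an overall title. A minimal sketch, assuming two plots that share the same color mapping; the names p_a, p_b, and dashboard and the title text are illustrative.

```r
library(ggplot2)
library(patchwork)

p_a <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(color = "Cylinders")

p_b <- ggplot(mtcars, aes(x = hp, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(color = "Cylinders")

dashboard <- (p_a | p_b) +
  plot_layout(guides = "collect") +   # show one shared legend
  plot_annotation(
    title = "mtcars overview",        # title for the whole figure
    tag_levels = "A"                  # label panels A, B, ...
  )

dashboard
```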
