R ggplot2 - Bar Plot with Examples

Key Insights

  • Use geom_bar() when ggplot2 should count the rows in each category and geom_col() when the data already holds the bar heights; geom_col() is the more explicit choice for pre-computed values
  • Position adjustments (dodge, stack, fill) control how grouped bars display, while horizontal layouts and reordering make bar plots easier to read
  • Themes, color scales, and faceting turn a default bar plot into a consistent, production-ready chart

Basic Bar Plot Structure

ggplot2 creates bar plots through two primary geoms: geom_bar() and geom_col(). Mixing them up is a common source of confusion. geom_bar() counts the rows in each category by default (stat = "count"), so it takes only an x aesthetic, while geom_col() plots the y values in your data as they are (stat = "identity").

library(ggplot2)

# Sample dataset
sales_data <- data.frame(
  product = c("A", "B", "C", "D", "E"),
  revenue = c(45000, 62000, 38000, 71000, 54000)
)

# Using geom_col() for direct values
ggplot(sales_data, aes(x = product, y = revenue)) +
  geom_col()

# Using geom_bar() with count data
customer_data <- data.frame(
  category = rep(c("Premium", "Standard", "Basic"), c(45, 78, 32))
)

ggplot(customer_data, aes(x = category)) +
  geom_bar()

The aesthetic mapping aes() defines which variables map to which visual properties. For bar plots, the x-axis typically holds the categories, while the y-axis shows values (geom_col()) or counts (geom_bar()).
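To make the difference concrete, here is a minimal sketch that counts the customer data by hand and checks that geom_col() on the pre-computed counts draws the same bar heights as geom_bar() on the raw rows. The names category_counts, p_bar, and p_col are illustrative and not part of the original examples.

```r
library(ggplot2)

customer_data <- data.frame(
  category = rep(c("Premium", "Standard", "Basic"), c(45, 78, 32))
)

# Count the rows per category ourselves; responseName names the count column
category_counts <- as.data.frame(
  table(category = customer_data$category),
  responseName = "n"
)

# geom_bar() counts rows itself; geom_col() draws the pre-computed n column
p_bar <- ggplot(customer_data, aes(x = category)) + geom_bar()
p_col <- ggplot(category_counts, aes(x = category, y = n)) + geom_col()

# layer_data() returns the values ggplot2 computes for a layer
bar_layer <- layer_data(p_bar)
col_layer <- layer_data(p_col)
bar_heights <- bar_layer$y[order(as.numeric(bar_layer$x))]
col_heights <- col_layer$y[order(as.numeric(col_layer$x))]

all(bar_heights == col_heights)
```

If your data arrives already aggregated, as most database exports do, geom_col() is the one you want.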

Grouped and Stacked Bars

Real-world data often requires comparing multiple variables across categories. Position adjustments handle this through position_dodge(), position_stack(), or position_fill(), which you can also request with the shorthand strings "dodge", "stack", and "fill".

# Multi-category dataset
quarterly_sales <- data.frame(
  quarter = rep(c("Q1", "Q2", "Q3", "Q4"), each = 3),
  region = rep(c("North", "South", "West"), 4),
  sales = c(120, 95, 110, 135, 108, 125, 
            142, 118, 138, 155, 132, 148)
)

# Grouped bars (side-by-side)
ggplot(quarterly_sales, aes(x = quarter, y = sales, fill = region)) +
  geom_col(position = "dodge") +
  labs(title = "Sales by Quarter and Region",
       y = "Sales (thousands)",
       fill = "Region")

# Stacked bars
ggplot(quarterly_sales, aes(x = quarter, y = sales, fill = region)) +
  geom_col(position = "stack")

# Proportional stacking (100% bars)
ggplot(quarterly_sales, aes(x = quarter, y = sales, fill = region)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent)

Dodge positioning works best when comparing absolute values across groups. Stacked bars show total composition, while fill positioning emphasizes proportional relationships.
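What position = "fill" does is rescale each stack so it sums to 1. The sketch below computes those shares by hand with base R's ave() and prints them as labels inside a stacked chart. The share column and the label styling are additions for illustration, not part of the original example.

```r
library(ggplot2)

quarterly_sales <- data.frame(
  quarter = rep(c("Q1", "Q2", "Q3", "Q4"), each = 3),
  region = rep(c("North", "South", "West"), 4),
  sales = c(120, 95, 110, 135, 108, 125,
            142, 118, 138, 155, 132, 148)
)

# Total sales per quarter, repeated on every row of that quarter
quarter_total <- ave(quarterly_sales$sales, quarterly_sales$quarter, FUN = sum)

# Each region's share of its quarter; these are the heights "fill" would draw
quarterly_sales$share <- quarterly_sales$sales / quarter_total

p_share <- ggplot(quarterly_sales, aes(x = quarter, y = share, fill = region)) +
  geom_col() +  # the default position is "stack"
  geom_text(aes(label = scales::percent(share, accuracy = 1)),
            position = position_stack(vjust = 0.5),  # center labels in segments
            color = "white") +
  scale_y_continuous(labels = scales::percent)

p_share
```

Computing the share yourself costs a couple of lines but gives you a column you can label, sort by, or export.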

Horizontal Bars and Reordering

Horizontal bars keep long category names readable without rotating the text. Reordering bars by value makes the ranking obvious at a glance.

# Create dataset with longer names
tech_stack <- data.frame(
  technology = c("PostgreSQL", "Redis", "Elasticsearch", 
                 "MongoDB", "Cassandra", "Neo4j"),
  usage_score = c(87, 92, 78, 65, 54, 43)
)

# Horizontal bars with reordering
ggplot(tech_stack, aes(x = usage_score, 
                       y = reorder(technology, usage_score))) +
  geom_col(fill = "steelblue") +
  labs(x = "Usage Score", y = NULL,
       title = "Technology Adoption Scores") +
  theme_minimal()

# Reorder with factor levels for custom order
priority_order <- c("MongoDB", "Cassandra", "Neo4j", 
                    "Elasticsearch", "Redis", "PostgreSQL")

tech_stack$technology <- factor(tech_stack$technology, 
                                levels = priority_order)

ggplot(tech_stack, aes(x = technology, y = usage_score)) +
  geom_col(fill = "coral") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

The reorder() function sorts categories by another variable, in ascending order by default. For descending order, use reorder(category, -value).
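A quick way to see what reorder() does is to inspect the factor levels it returns, since ggplot2 draws discrete axes in level order. This sketch uses the same technology scores; the variable names ascending and descending are illustrative.

```r
tech_stack <- data.frame(
  technology = c("PostgreSQL", "Redis", "Elasticsearch",
                 "MongoDB", "Cassandra", "Neo4j"),
  usage_score = c(87, 92, 78, 65, 54, 43)
)

# Levels sorted by usage_score, lowest first
ascending <- reorder(tech_stack$technology, tech_stack$usage_score)

# Negating the score flips the order, highest first
descending <- reorder(tech_stack$technology, -tech_stack$usage_score)

levels(ascending)
levels(descending)
```

On a horizontal plot the first level sits at the bottom of the y-axis, so the ascending order is the one that puts the largest bar at the top.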

Color Customization and Scales

Color communicates meaning beyond mere aesthetics. Manual scales provide precise control, while palette packages offer professional color schemes.

# Manual color assignment
performance_data <- data.frame(
  metric = c("Latency", "Throughput", "Error Rate", "CPU Usage"),
  value = c(85, 92, 3, 67),
  status = c("good", "good", "bad", "warning")
)

status_colors <- c("good" = "#2ecc71", 
                   "bad" = "#e74c3c", 
                   "warning" = "#f39c12")

ggplot(performance_data, aes(x = metric, y = value, fill = status)) +
  geom_col() +
  scale_fill_manual(values = status_colors) +
  theme_minimal()

# Using viridis color scale for continuous values
gradient_data <- data.frame(
  service = paste("Service", 1:8),
  response_time = c(120, 250, 180, 95, 310, 145, 220, 175)
)

ggplot(gradient_data, aes(x = service, y = response_time, 
                          fill = response_time)) +
  geom_col() +
  scale_fill_viridis_c(option = "plasma") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

For categorical data, use scale_fill_manual() or scale_fill_brewer(). Continuous values work with scale_fill_gradient() or viridis scales.
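One detail worth knowing about scale_fill_manual(): when the values vector is named, colors are matched to categories by name rather than by position. The sketch below lists the colors in a deliberately different order and reads the drawn fills back with layer_data() to check the mapping. The helper variable fill_by_metric is illustrative.

```r
library(ggplot2)

performance_data <- data.frame(
  metric = c("Latency", "Throughput", "Error Rate", "CPU Usage"),
  value = c(85, 92, 3, 67),
  status = c("good", "good", "bad", "warning")
)

# Deliberately listed in a different order than the statuses appear in the data
status_colors <- c("warning" = "#f39c12",
                   "good" = "#2ecc71",
                   "bad" = "#e74c3c")

p_status <- ggplot(performance_data, aes(x = metric, y = value, fill = status)) +
  geom_col() +
  scale_fill_manual(values = status_colors)

# Read back the fill ggplot2 assigned to each bar, in x-axis order
drawn <- layer_data(p_status)
metric_levels <- sort(unique(performance_data$metric))
fill_by_metric <- setNames(drawn$fill[order(as.numeric(drawn$x))], metric_levels)

fill_by_metric
```

Naming the vector protects the color meaning if a status disappears from the data or the factor order changes.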

Adding Labels and Annotations

Data labels let readers take exact values straight off the chart instead of estimating them from the axis.

# Bar labels with geom_text
monthly_metrics <- data.frame(
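  # A factor keeps the months in calendar order; plain strings sort alphabetically
  month = factor(month.abb[1:6], levels = month.abb[1:6]),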
  conversions = c(1247, 1389, 1523, 1456, 1678, 1734)
)

ggplot(monthly_metrics, aes(x = month, y = conversions)) +
  geom_col(fill = "dodgerblue", alpha = 0.8) +
  geom_text(aes(label = conversions), 
            vjust = -0.5, 
            size = 3.5) +
  ylim(0, 2000) +  # headroom so the labels above the bars are not clipped
  theme_minimal() +
  labs(title = "Monthly Conversions",
       y = "Number of Conversions",
       x = NULL)

# Labels inside bars for horizontal plots
ggplot(tech_stack, aes(x = usage_score, 
                       y = reorder(technology, usage_score))) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = usage_score), 
            hjust = 1.2, 
            color = "white",
            fontface = "bold") +
  theme_minimal()

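Position labels with vjust (vertical) and hjust (horizontal) adjustments. Each label is anchored at the end of its bar, and the adjustment shifts the text relative to that point. On vertical bars, a negative vjust such as -0.5 lifts the label above the bar, while a value above 1 drops it inside. On horizontal bars, an hjust above 1 such as 1.2 pulls the label inside the bar, while a negative value pushes it outside.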
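Labels on grouped bars need one extra step: geom_text() has to be dodged by the same amount as the bars, or every label lands at the center of its category. A minimal sketch, assuming the default bar width of 0.9 and using the first two quarters of the sales data; the names half_year and dodge are illustrative.

```r
library(ggplot2)

half_year <- data.frame(
  quarter = rep(c("Q1", "Q2"), each = 3),
  region = rep(c("North", "South", "West"), 2),
  sales = c(120, 95, 110, 135, 108, 125)
)

# One position object shared by both layers keeps bars and labels aligned
dodge <- position_dodge(width = 0.9)

p_labels <- ggplot(half_year, aes(x = quarter, y = sales, fill = region)) +
  geom_col(position = dodge) +
  geom_text(aes(label = sales),
            position = dodge,
            vjust = -0.5,   # lift each label just above its bar
            size = 3.5) +
  expand_limits(y = 150)    # headroom for the labels on the tallest bars

p_labels
```

The fill mapping in ggplot() is what defines the groups for both layers, so the text layer is dodged by region just like the bars.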

Faceting for Multi-Dimensional Data

Faceting splits data into multiple plots, revealing patterns across additional dimensions without cluttering single visualizations.

# Complex dataset
api_metrics <- data.frame(
  endpoint = rep(c("/users", "/orders", "/products", "/auth"), each = 12),
  # A factor keeps the months in calendar order on the x-axis
  month = factor(rep(month.abb[1:6], each = 2, times = 4),
                 levels = month.abb[1:6]),
  metric_type = rep(c("Requests", "Errors"), 24),
  value = c(
    15000, 45, 16200, 38, 17500, 52, 18900, 41, 19800, 48, 21000, 55,
    8500, 12, 9200, 15, 9800, 18, 10500, 14, 11200, 19, 12000, 21,
    22000, 78, 23500, 82, 25000, 95, 26800, 88, 28500, 102, 30000, 110,
    5000, 8, 5400, 6, 5800, 9, 6200, 7, 6600, 11, 7000, 10
  )
)

ggplot(api_metrics, aes(x = month, y = value, fill = metric_type)) +
  geom_col(position = "dodge") +
  facet_wrap(~endpoint, scales = "free_y", ncol = 2) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "API Metrics by Endpoint",
       y = "Count",
       x = NULL,
       fill = "Metric")

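Use facet_wrap() to split by one variable or facet_grid() to lay out two variables as rows and columns. The scales argument controls whether panels share axes: "fixed" (the default) uses the same range everywhere, while "free_y" gives each panel its own y-axis range. Note that within each panel here, requests and errors still share one y-axis, so the much smaller error counts are hard to see.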
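As a sketch of the two-variable case, facet_grid() can put the metric on rows and the endpoint on columns. With scales = "free_y", each row gets its own y-axis, so error counts in the tens remain readable next to request counts in the thousands. The api_subset data frame is a cut-down copy of the values above (two endpoints, three months), built only for this illustration.

```r
library(ggplot2)

api_subset <- data.frame(
  endpoint = rep(c("/users", "/orders"), each = 6),
  month = factor(rep(month.abb[1:3], each = 2, times = 2),
                 levels = month.abb[1:3]),
  metric_type = rep(c("Requests", "Errors"), 6),
  value = c(15000, 45, 16200, 38, 17500, 52,
            8500, 12, 9200, 15, 9800, 18)
)

p_grid <- ggplot(api_subset, aes(x = month, y = value)) +
  geom_col(fill = "steelblue") +
  # Rows: metric_type, columns: endpoint; each row gets its own y range
  facet_grid(metric_type ~ endpoint, scales = "free_y") +
  theme_minimal() +
  labs(x = NULL, y = "Count")

p_grid
```

The trade-off is that bar heights are no longer comparable between rows, so keep the axis labels visible.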

Error Bars and Confidence Intervals

When bars show estimates such as means, error bars or confidence intervals show how uncertain those estimates are.

# Dataset with error margins
experiment_results <- data.frame(
  condition = c("Control", "Variant A", "Variant B", "Variant C"),
  mean_value = c(45.2, 52.8, 48.6, 56.3),
  std_error = c(2.1, 2.8, 2.3, 3.1)
)

ggplot(experiment_results, aes(x = condition, y = mean_value)) +
  geom_col(fill = "steelblue", alpha = 0.7) +
  geom_errorbar(aes(ymin = mean_value - std_error, 
                    ymax = mean_value + std_error),
                width = 0.2,
                linewidth = 0.8) +  # linewidth requires ggplot2 >= 3.4.0
  theme_minimal() +
  labs(title = "A/B Test Results with Standard Error",
       y = "Conversion Rate (%)",
       x = "Test Condition")

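Here each error bar extends one standard error either side of the mean. Say in the title or caption what the bars represent, because standard errors, standard deviations, and confidence intervals look identical on a chart.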
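The example above starts from a table that already contains means and standard errors. If you have raw observations instead, a summary like the following sketch produces the same shape of table. The raw values are made up for illustration, the standard error is computed as sd / sqrt(n), and the 95% interval uses a t quantile, which assumes roughly normal data.

```r
library(ggplot2)

# Hypothetical raw observations, four per condition
raw_results <- data.frame(
  condition = rep(c("Control", "Variant A"), each = 4),
  value = c(44, 46, 45, 47, 50, 54, 52, 56)
)

# One summary row per condition: mean, standard error, sample size
summary_df <- do.call(rbind, lapply(
  split(raw_results, raw_results$condition),
  function(d) {
    data.frame(
      condition = d$condition[1],
      mean_value = mean(d$value),
      std_error = sd(d$value) / sqrt(nrow(d)),
      n = nrow(d)
    )
  }
))

# Half-width of a 95% confidence interval based on the t distribution
summary_df$ci_half <- qt(0.975, df = summary_df$n - 1) * summary_df$std_error

p_ci <- ggplot(summary_df, aes(x = condition, y = mean_value)) +
  geom_col(fill = "steelblue", alpha = 0.7) +
  geom_errorbar(aes(ymin = mean_value - ci_half,
                    ymax = mean_value + ci_half),
                width = 0.2) +
  theme_minimal() +
  labs(title = "Mean with 95% confidence interval", x = NULL, y = "Value")

p_ci
```

With samples this small the confidence interval is much wider than the standard error, which is exactly why the chart should say which one it shows.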

Production-Ready Styling

Professional visualizations require consistent theming and careful attention to typography, spacing, and color harmony.

# Complete styled example
theme_custom <- theme_minimal() +
  theme(
    plot.title = element_text(size = 14, face = "bold", margin = margin(b = 10)),
    plot.subtitle = element_text(size = 10, color = "gray40", margin = margin(b = 15)),
    axis.title = element_text(size = 10, face = "bold"),
    axis.text = element_text(size = 9),
    legend.position = "top",
    legend.title = element_text(size = 10, face = "bold"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank()
  )

ggplot(quarterly_sales, aes(x = quarter, y = sales, fill = region)) +
  geom_col(position = "dodge", width = 0.7) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title = "Quarterly Sales Performance by Region",
    subtitle = "Year 2024 | Values in thousands USD",
    x = NULL,
    y = "Sales Revenue",
    fill = "Region"
  ) +
  theme_custom

Storing the theme in an object such as theme_custom lets you apply identical styling to every chart with a single + theme_custom, which keeps a report or dashboard visually consistent.
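If several scripts need the same styling, one option is to wrap it in a function so the base font size stays adjustable. The sketch below shows a hypothetical helper named theme_report(); the name and its base_size default are assumptions for illustration, not an established convention.

```r
library(ggplot2)

# Hypothetical helper: a reusable theme with an adjustable base font size
theme_report <- function(base_size = 11) {
  theme_minimal(base_size = base_size) +
    theme(
      plot.title = element_text(face = "bold", margin = margin(b = 10)),
      legend.position = "top",
      panel.grid.minor = element_blank(),
      panel.grid.major.x = element_blank()
    )
}

sales_data <- data.frame(
  product = c("A", "B", "C", "D", "E"),
  revenue = c(45000, 62000, 38000, 71000, 54000)
)

p_report <- ggplot(sales_data, aes(x = product, y = revenue)) +
  geom_col(fill = "steelblue") +
  labs(title = "Revenue by Product", x = NULL, y = "Revenue") +
  theme_report(base_size = 12)

p_report
```

Calling theme_set(theme_report()) at the top of a script would make it the default for every plot in the session.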
