R stringr - str_c() / str_glue() - Concatenate

Key Insights

  • str_c() propagates NA values (returning NA), while base R’s paste() silently converts them to the string “NA”. Propagation keeps missing data visible instead of letting it corrupt strings further down a pipeline
  • str_glue() provides Python f-string-style interpolation with {variable} syntax, making complex string construction readable and maintainable
  • Choose str_c() for programmatic concatenation of vectors and str_glue() for human-readable template strings with embedded expressions

Introduction

String concatenation seems trivial until you’re debugging why your data pipeline silently converted missing values into the literal string “NA” and corrupted downstream processing. Base R’s paste() function has served us for decades, but its permissive behavior around edge cases creates problems in production code.

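The stringr package, part of the tidyverse ecosystem, provides two alternatives: str_c() for programmatic concatenation and str_glue() for template-based string construction. str_c() propagates missing values instead of hiding them, str_glue() (a thin wrapper around glue::glue()) keeps templates readable, and both fit the tidyverse’s philosophy of explicit, predictable operations. Note that str_glue() renders NA as the text “NA” by default, so the NA safety discussed in this article applies to str_c().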

This article covers both functions in depth, showing you when to reach for each and how to avoid common pitfalls.

str_c() Basics

The str_c() function concatenates strings with explicit control over separators and NA handling. Its signature is straightforward:

str_c(..., sep = "", collapse = NULL)

The sep parameter inserts a delimiter between consecutive arguments. The collapse parameter reduces a vector result to a single string; the vector examples later in this article show it in action.

library(stringr)

# Basic concatenation
str_c("Hello", "World")
#> [1] "HelloWorld"

# With separator
str_c("Hello", "World", sep = " ")
#> [1] "Hello World"

# Multiple arguments
str_c("2024", "01", "15", sep = "-")
#> [1] "2024-01-15"

The critical difference from paste() appears with NA values:

name <- NA

# paste() silently converts NA to string
paste("Hello", name)
#> [1] "Hello NA"

# str_c() propagates NA
str_c("Hello ", name)
#> [1] NA

This propagation behavior is intentional and valuable. str_c() does not raise an error; it returns NA, so when you’re building identifiers or paths from potentially missing data, the gap stays visible in the result instead of being baked into garbage output such as “user_NA”. If you genuinely want to include “NA” as text, use str_replace_na() first to make that intention explicit.
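As a minimal sketch of that explicit opt-in (the name vector is made-up example data), str_replace_na() can turn missing values into text before concatenating, either as the default “NA” or as a placeholder passed through its replacement argument:

```r
library(stringr)

# Hypothetical input: one known name, one missing
name <- c("Alice", NA)

# Default behavior: the missing value propagates
greeting_na <- str_c("Hello ", name)
greeting_na
#> [1] "Hello Alice" NA

# Explicitly opt in to the literal text "NA"
greeting_text <- str_c("Hello ", str_replace_na(name))
greeting_text
#> [1] "Hello Alice" "Hello NA"

# Or substitute a placeholder of your choosing
greeting_placeholder <- str_c("Hello ", str_replace_na(name, replacement = "guest"))
greeting_placeholder
#> [1] "Hello Alice" "Hello guest"
```

The placeholder variant is often the more useful one in user-facing text, since “Hello NA” is rarely what you want to show.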

str_c() with Vectors

Where str_c() shines is vectorized operations. When you pass vectors, it performs element-wise concatenation:

first_names <- c("Alice", "Bob", "Carol")
last_names <- c("Smith", "Jones", "Williams")

str_c(first_names, last_names, sep = " ")
#> [1] "Alice Smith"   "Bob Jones"     "Carol Williams"

The collapse parameter transforms a vector result into a single string:

# Without collapse: returns vector
str_c("item", 1:3, sep = "_")
#> [1] "item_1" "item_2" "item_3"

# With collapse: returns single string
str_c("item", 1:3, sep = "_", collapse = ", ")
#> [1] "item_1, item_2, item_3"

# Practical example: building a SQL IN clause
ids <- c(101, 102, 103)
str_c("(", str_c(ids, collapse = ", "), ")")
#> [1] "(101, 102, 103)"
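One caveat with collapse: because NA is contagious in str_c(), a single missing element turns the whole collapsed string into NA. The sketch below uses a hypothetical tags vector and drops missing values with base R subsetting before collapsing; whether dropping them is appropriate depends on your data.

```r
library(stringr)

# Hypothetical input with one missing element
tags <- c("red", NA, "blue")

# The single NA makes the entire collapsed result NA
collapsed <- str_c(tags, collapse = ", ")
collapsed
#> [1] NA

# Drop missing values first if a partial list is acceptable
collapsed_clean <- str_c(tags[!is.na(tags)], collapse = ", ")
collapsed_clean
#> [1] "red, blue"
```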

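Recycling rules apply when vectors have different lengths. Since stringr 1.5.0, str_c() follows tidyverse recycling rules: length-1 vectors are recycled to match the longest vector, while any other length mismatch raises an error instead of recycling silently: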

str_c("prefix", c("a", "b", "c"), "suffix", sep = "_")
#> [1] "prefix_a_suffix" "prefix_b_suffix" "prefix_c_suffix"
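A caveat worth checking against your installed version: in recent stringr releases, only length-1 vectors are recycled, and other mismatched lengths are rejected. The sketch below wraps the mismatched call in tryCatch() so it runs either way; the handler messages are illustrative, not stringr's own wording.

```r
library(stringr)

# Length-1 values are recycled against the length-3 vector
ok <- str_c("id", c("a", "b", "c"), sep = "_")
ok
#> [1] "id_a" "id_b" "id_c"

# Lengths 2 and 3 have no common size. Recent stringr versions are expected
# to raise an error here; older ones may recycle with a warning instead.
mismatch <- tryCatch(
  str_c(c("x", "y"), c("a", "b", "c")),
  error = function(e) "error: incompatible lengths",
  warning = function(w) "warning: lengths recycled"
)
mismatch
```

If you rely on recycling longer vectors, rep_len() from base R makes that intent explicit and works regardless of version.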

str_glue() Basics

While str_c() excels at programmatic concatenation, str_glue() provides a more readable approach for template-based strings. If you’ve used Python f-strings or JavaScript template literals, the syntax will feel familiar:

name <- "Alice"
age <- 30

str_glue("My name is {name} and I am {age} years old.")
#> My name is Alice and I am 30 years old.

Variables inside curly braces are evaluated and inserted into the string. This eliminates the visual noise of multiple str_c() calls and quoted fragments:

# Compare readability
product <- "Widget"
price <- 29.99
quantity <- 5

# Using str_c()
str_c("Order: ", quantity, "x ", product, " @ $", price, " each")
#> [1] "Order: 5x Widget @ $29.99 each"

# Using str_glue()
str_glue("Order: {quantity}x {product} @ ${price} each")
#> Order: 5x Widget @ $29.99 each

The str_glue() version reads almost like the final output, making it easier to verify correctness at a glance.
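One pitfall to keep in mind: the NA propagation described for str_c() is not the default in str_glue(), which inserts the text “NA” much like paste(). The sketch below assumes that str_glue() forwards extra arguments to glue::glue(), whose .na argument controls this behavior; passing .na = NULL asks for propagation instead.

```r
library(stringr)

name <- NA

# Default: the missing value is rendered as the text "NA"
default_glue <- str_glue("Hello {name}")
default_glue
#> Hello NA

# .na = NULL (a glue::glue() argument, assumed to be passed through)
# makes the whole result missing instead
propagated <- str_glue("Hello {name}", .na = NULL)
is.na(propagated)
#> [1] TRUE
```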

Advanced str_glue() Features

The curly braces in str_glue() accept any R expression, not just variable names:

str_glue("Total: ${price * quantity}")
#> Total: $149.95

str_glue("Uppercase: {toupper(product)}")
#> Uppercase: WIDGET

# Output depends on the date the code runs
str_glue("Today is {format(Sys.Date(), '%B %d, %Y')}")
#> Today is January 15, 2024
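Because the braces take arbitrary expressions, number formatting can happen inline. This matters for currency: a numeric result is converted with R's default formatting, which drops trailing zeros. A small sketch with made-up values, using base R's sprintf() to force two decimals:

```r
library(stringr)

# Hypothetical values chosen so the product is a whole number
price <- 2.5
quantity <- 10

# Default conversion drops the trailing zeros
raw_total <- str_glue("Total: ${price * quantity}")
raw_total
#> Total: $25

# sprintf() inside the braces fixes the number of decimals
fixed_total <- str_glue("Total: ${sprintf('%.2f', price * quantity)}")
fixed_total
#> Total: $25.00
```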

For working with data frames, str_glue_data() evaluates expressions in the context of the data:

library(tibble)

orders <- tibble(
  customer = c("Alice", "Bob", "Carol"),
  product = c("Widget", "Gadget", "Gizmo"),
  quantity = c(5, 3, 8),
  price = c(29.99, 49.99, 19.99)
)

str_glue_data(orders, "{customer} ordered {quantity} {product}(s)")
#> Alice ordered 5 Widget(s)
#> Bob ordered 3 Gadget(s)
#> Carol ordered 8 Gizmo(s)
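str_glue_data() is not limited to data frames; its first argument can also be a named list or an environment. A minimal sketch, using a hypothetical connection-settings list:

```r
library(stringr)

# Hypothetical settings; the names become available inside the braces
conn <- list(host = "localhost", port = 5432)

address <- str_glue_data(conn, "{host}:{port}")
address
#> localhost:5432
```

This keeps template values grouped in one object instead of scattered across the calling environment.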

Inside dplyr::mutate(), plain str_glue() is enough, because the expressions in braces are evaluated against the data frame’s columns:

library(dplyr)

orders |>
  mutate(
    summary = str_glue("{customer}: {quantity}x {product} = ${quantity * price}")
  )
# Note: tibble printing shows price rounded to three significant digits;
# the stored values are unchanged
#> # A tibble: 3 × 5
#>   customer product quantity price summary
#>   <chr>    <chr>      <dbl> <dbl> <glue>
#> 1 Alice    Widget         5  30.0 Alice: 5x Widget = $149.95
#> 2 Bob      Gadget         3  50.0 Bob: 3x Gadget = $149.97
#> 3 Carol    Gizmo          8  20.0 Carol: 8x Gizmo = $159.92
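The <glue> column type appears because str_glue() returns a glue object, a character vector with an extra class. Most functions treat it as ordinary text, but if something downstream expects a plain character vector, as.character() converts it. A short sketch:

```r
library(stringr)

label <- str_glue("{2 + 3} items")
class(label)
#> [1] "glue"      "character"

# Strip the glue class when plain character data is required
plain <- as.character(label)
class(plain)
#> [1] "character"
plain
#> [1] "5 items"
```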

To include literal curly braces in output, double them:

str_glue("Use {{variable}} syntax in str_glue()")
#> Use {variable} syntax in str_glue()
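Doubled braces come up most often when the output format itself uses braces, such as JSON. The sketch below mixes literal braces with one interpolated value; user_id is a made-up variable, and for real JSON a dedicated serializer is the safer choice.

```r
library(stringr)

# Hypothetical value to embed
user_id <- 42

# {{ and }} produce literal braces; {user_id} is interpolated
payload <- str_glue('{{"id": {user_id}}}')
payload
#> {"id": 42}
```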

str_c() vs str_glue() Comparison

Both functions can produce identical output, but they serve different purposes:

# Same result, different approaches
first <- "John"
last <- "Doe"
email_domain <- "example.com"

# str_c(): good for programmatic construction
str_c(tolower(first), ".", tolower(last), "@", email_domain)
#> [1] "john.doe@example.com"

# str_glue(): good for readable templates
str_glue("{tolower(first)}.{tolower(last)}@{email_domain}")
#> john.doe@example.com

Use str_c() when:

  • Concatenating vectors element-wise
  • Building strings programmatically in loops or functions
  • You need the collapse parameter to reduce vectors
  • Performance is critical (marginally faster for simple cases)

Use str_glue() when:

  • Creating human-readable message templates
  • The string structure should be visually apparent
  • Embedding multiple expressions in a complex template
  • Working with data frames via str_glue_data()

Performance differences are negligible for most applications. Choose based on readability and intent.

Practical Applications

Here are real-world patterns you’ll use repeatedly.

Building file paths:

base_dir <- "/data/exports"
year <- 2024
month <- "01"
file_type <- "csv"

# Dynamic path construction
path <- str_glue("{base_dir}/{year}/{month}/report.{file_type}")
path
#> /data/exports/2024/01/report.csv

# Batch file paths (length-1 values are recycled against 1:12)
files <- str_c(base_dir, "/", year, "/data_", 1:12, ".csv")
head(files, 2)
#> [1] "/data/exports/2024/data_1.csv" "/data/exports/2024/data_2.csv"
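The numeric suffixes in the batch paths are unpadded, so data_10.csv sorts before data_2.csv in an alphabetical listing. One way to avoid that, sketched here with stringr's str_pad(), is to zero-pad the numbers before concatenating:

```r
library(stringr)

base_dir <- "/data/exports"
year <- 2024

# Left-pad 1..12 to width 2 with zeros: "01", "02", ..., "12"
suffixes <- str_pad(1:12, width = 2, pad = "0")

files <- str_c(base_dir, "/", year, "/data_", suffixes, ".csv")
files[c(1, 12)]
#> [1] "/data/exports/2024/data_01.csv" "/data/exports/2024/data_12.csv"
```

For paths that must work across operating systems, base R's file.path() is worth considering for the directory part, with str_c() or str_glue() building only the file name.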

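Generating SQL queries (for trusted values only; prefer parameterized queries or glue::glue_sql() when any part of the query comes from user input):

table_name <- "users"
columns <- c("id", "name", "email")
where_ids <- c(1, 2, 3)

query <- str_glue(
  "SELECT {str_c(columns, collapse = ', ')}
   FROM {table_name}
   WHERE id IN ({str_c(where_ids, collapse = ', ')})"
)
query
# glue's trimming strips the shared indentation from the continuation lines
#> SELECT id, name, email
#> FROM users
#> WHERE id IN (1, 2, 3)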

Creating log messages:

log_message <- function(level, component, message) {
  timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
  str_glue("[{timestamp}] [{toupper(level)}] [{component}] {message}")
}

log_message("info", "data_loader", "Processing started")
# Timestamp reflects the time of the call
#> [2024-01-15 10:30:45] [INFO] [data_loader] Processing started

Generating reports from data:

summary_stats <- tibble(
  metric = c("Total Sales", "Average Order", "Customer Count"),
  value = c(150000, 75.50, 2000),
  change = c(0.12, -0.03, 0.08)
)

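summary_stats |>
  mutate(
    # Counts are not currency, so format them separately from dollar amounts
    formatted = if_else(
      metric == "Customer Count",
      scales::comma(value, accuracy = 1),
      scales::dollar(value, accuracy = 0.01)
    ),
    report_line = str_glue(
      "{metric}: {formatted} ({scales::percent(change, accuracy = 0.1)} vs last period)"
    )
  ) |>
  pull(report_line) |>
  str_c(collapse = "\n") |>
  cat()
#> Total Sales: $150,000.00 (12.0% vs last period)
#> Average Order: $75.50 (-3.0% vs last period)
#> Customer Count: 2,000 (8.0% vs last period)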

The stringr concatenation functions eliminate entire categories of bugs while making your code more readable. The NA propagation in str_c() catches data quality issues early. The template syntax in str_glue() makes string construction self-documenting. Add them to your standard toolkit and stop wrestling with paste().
