R dplyr - rename() Columns | Application Architect

Key Insights

The rename() function provides a clean, intuitive syntax for renaming columns using new_name = old_name format, making it more readable than base R alternatives
rename_with() enables programmatic column renaming using functions, supporting complex transformations like case conversion, prefix/suffix addition, and pattern-based modifications
Selection helpers (starts_with(), contains(), matches()) combined with rename_with() allow targeted column renaming based on patterns rather than explicit column names

Basic Column Renaming with rename()

The rename() function from dplyr uses a straightforward syntax where you specify the new name on the left and the old name on the right. This mirrors ordinary assignment, where the target sits on the left, so a call reads naturally as "new name gets old column".

library(dplyr)

# Sample dataset
employees <- data.frame(
  emp_id = 1:5,
  first_nm = c("John", "Sarah", "Mike", "Lisa", "Tom"),
  dept_cd = c("ENG", "HR", "ENG", "FIN", "HR"),
  sal_amt = c(75000, 65000, 80000, 70000, 60000)
)

# Rename single column
employees_clean <- employees %>%
  rename(employee_id = emp_id)

# Rename multiple columns
employees_clean <- employees %>%
  rename(
    employee_id = emp_id,
    first_name = first_nm,
    department_code = dept_cd,
    salary_amount = sal_amt
  )

print(employees_clean)

The function preserves column order and data types. Unlike base R’s names() assignment, you don’t need to reference columns by position or worry about maintaining the entire names vector.
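The claim about order and types is easy to check. The following is a minimal sketch; the data frame `emp` and its values are illustrative and not part of the original example.

```r
library(dplyr)

# Illustrative data frame; names and values are made up for this check
emp <- data.frame(
  emp_id = 1:3,
  first_nm = c("John", "Sarah", "Mike"),
  sal_amt = c(75000, 65000, 80000)
)

renamed <- emp %>%
  rename(employee_id = emp_id, salary_amount = sal_amt)

# Column order is unchanged; only the labels differ
names(renamed)

# Column classes are the same before and after renaming
classes_before <- unname(sapply(emp, class))
classes_after <- unname(sapply(renamed, class))
identical(classes_before, classes_after)
```

Columns that are not mentioned in the call, such as first_nm here, pass through untouched.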

Comparing rename() to Base R Approaches

Base R offers several methods for renaming columns, but they’re more verbose and error-prone.

# Each base R example works on a copy so that employees keeps its original names

# Base R approach 1: Direct assignment
emp_base1 <- employees
names(emp_base1)[names(emp_base1) == "emp_id"] <- "employee_id"

# Base R approach 2: Position-based
emp_base2 <- employees
names(emp_base2)[1] <- "employee_id"

# Base R approach 3: Complete reassignment
emp_base3 <- employees
names(emp_base3) <- c("employee_id", "first_name", "department_code", "salary_amount")

# dplyr approach: clearer intent
employees %>%
  rename(employee_id = emp_id)

The dplyr approach wins on readability and safety. You explicitly state which column you’re renaming, reducing the risk of accidentally renaming the wrong column when your dataset structure changes.
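To make the safety argument concrete, here is a small sketch of what happens when the column order changes upstream. The data frame `reordered` and the column `employee_number` are hypothetical and exist only for this illustration.

```r
library(dplyr)

# Hypothetical data whose column order changed upstream:
# first_nm now comes first, emp_id second
reordered <- data.frame(
  first_nm = c("John", "Sarah"),
  emp_id = 1:2
)

# Position-based renaming silently relabels the wrong column
by_position <- reordered
names(by_position)[1] <- "employee_id"
names(by_position)  # first_nm has been mislabeled as employee_id

# rename() targets the column by name, so order does not matter
by_name <- reordered %>% rename(employee_id = emp_id)
names(by_name)

# If the column is missing entirely, rename() stops with an error
# rather than continuing silently
missing_col <- tryCatch(
  reordered %>% rename(employee_id = employee_number),
  error = function(e) "rename() raised an error"
)
missing_col
```

Failing loudly on a missing column is usually what you want in a pipeline, since the problem surfaces at the renaming step rather than several steps later.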

Programmatic Renaming with rename_with()

When you need to apply transformations to column names systematically, rename_with() accepts a function that processes each column name.

# Convert all columns to uppercase
employees %>%
  rename_with(toupper)

# Convert to lowercase
employees %>%
  rename_with(tolower)

# Custom function to clean names
clean_column_names <- function(x) {
  x %>%
    tolower() %>%
    gsub("_nm$", "_name", .) %>%
    gsub("_cd$", "_code", .) %>%
    gsub("_amt$", "_amount", .)
}

employees_cleaned <- employees %>%
  rename_with(clean_column_names)

print(employees_cleaned)

This approach shines when dealing with datasets that follow naming conventions. Instead of manually renaming dozens of columns, you define the transformation logic once.

Selective Renaming with Selection Helpers

Selection helpers let you target specific columns based on patterns, positions, or characteristics.

# Rename columns starting with specific prefix
sales_data <- data.frame(
  id = 1:3,
  q1_revenue = c(100, 200, 150),
  q1_costs = c(50, 80, 60),
  q2_revenue = c(120, 210, 160),
  q2_costs = c(55, 85, 65)
)

# Add prefix to all quarter columns
sales_data %>%
  rename_with(~paste0("fy2024_", .), starts_with("q"))

# Convert specific columns to uppercase
sales_data %>%
  rename_with(toupper, starts_with("q1"))

# Rename columns containing a pattern
sales_data %>%
  rename_with(~gsub("revenue", "sales", .), contains("revenue"))

# Use matches() for regex patterns
sales_data %>%
  rename_with(~gsub("q([0-9])", "quarter_\\1", .), matches("^q[0-9]"))

Common selection helpers include:

starts_with(): Columns starting with a prefix
ends_with(): Columns ending with a suffix
contains(): Columns containing a string
matches(): Columns matching a regex pattern
where(): Columns satisfying a condition
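The two helpers from this list that the examples above do not exercise, ends_with() and where(), can be sketched as follows. The data frame `sales` is illustrative, and the replacement name "_expenses" is an arbitrary choice for the demonstration.

```r
library(dplyr)

# Illustrative data mirroring the quarterly layout used above
sales <- data.frame(
  id = 1:3,
  region = c("North", "South", "East"),
  q1_revenue = c(100, 200, 150),
  q1_costs = c(50, 80, 60),
  stringsAsFactors = FALSE
)

# ends_with(): relabel only the cost columns
costs_renamed <- sales %>%
  rename_with(~ sub("_costs$", "_expenses", .x), ends_with("_costs"))
names(costs_renamed)

# where(): upper-case the names of character columns only
chr_upper <- sales %>%
  rename_with(toupper, where(is.character))
names(chr_upper)
```

Note that where() inspects the column contents rather than the column names, so it selects by type or any other predicate on the data.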

Combining rename_with() with Anonymous Functions

For complex transformations on specific columns, combine rename_with() with anonymous functions using the ~ syntax. Inside such a formula, . (or .x) stands for the vector of selected column names.

# Remove common prefix from specific columns
metrics <- data.frame(
  metric_sales = c(100, 200),
  metric_profit = c(20, 40),
  metric_margin = c(0.2, 0.2),
  region = c("North", "South")
)

# Remove "metric_" prefix
metrics %>%
  rename_with(~sub("^metric_", "", .), starts_with("metric_"))

# Add suffix to numeric columns
metrics %>%
  rename_with(~paste0(., "_value"), where(is.numeric))

# Complex transformation: snake_case to camelCase
to_camel_case <- function(x) {
  gsub("_([a-z])", "\\U\\1", x, perl = TRUE)
}

metrics %>%
  rename_with(to_camel_case)

Handling Duplicate Column Names

When renaming creates duplicates, dplyr throws an error by default. Handle this explicitly in your transformation logic.

# This will error
tryCatch({
  employees %>%
    rename(name = first_nm, name = dept_cd)
}, error = function(e) {
  message("Error: ", conditionMessage(e))
})

# Proper approach: give each column a distinct name
employees %>%
  rename(
    first_name = first_nm,
    dept_name = dept_cd
  )

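# For data that already has duplicate names, repair them with make.unique()
duplicate_cols <- data.frame(
  value = 1:3,
  value = 4:6,
  value = 7:9,
  check.names = FALSE
)

# Assign through names() in base R; dplyr's column selection may reject
# duplicated names before rename_with() gets a chance to fix them
names(duplicate_cols) <- make.unique(names(duplicate_cols))
names(duplicate_cols)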
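Duplicates can also arise from the transformation itself, for example when stripping a prefix makes two names identical. One way to handle this explicitly is to check the proposed names before applying them. The sketch below uses only base R; the helper `find_name_collisions()` and the data frame `quarterly` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: returns the names that a renaming function
# would produce more than once
find_name_collisions <- function(df, fn) {
  new_names <- fn(names(df))
  unique(new_names[duplicated(new_names)])
}

# Illustrative data: stripping the quarter prefix makes two names collide
quarterly <- data.frame(
  q1_revenue = c(100, 200),
  q2_revenue = c(120, 210),
  region = c("North", "South")
)

strip_quarter <- function(x) sub("^q[0-9]_", "", x)

collisions <- find_name_collisions(quarterly, strip_quarter)
collisions

# One option when collisions exist: make the results unique explicitly
safe_names <- make.unique(strip_quarter(names(quarterly)), sep = "_")
safe_names
```

Whether to disambiguate with make.unique() or to stop and revise the transformation depends on the dataset; the check simply makes the collision visible before any renaming happens.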

Renaming in Pipeline Workflows

The rename() function integrates seamlessly into dplyr pipelines, allowing column renaming alongside filtering, mutation, and aggregation.

# Complete data processing pipeline
results <- employees %>%
  rename(
    id = emp_id,
    name = first_nm,
    department = dept_cd,
    salary = sal_amt
  ) %>%
  filter(salary > 65000) %>%
  mutate(
    salary_category = case_when(
      salary >= 75000 ~ "High",
      salary >= 65000 ~ "Medium",
      TRUE ~ "Low"
    )
  ) %>%
  group_by(department) %>%
  summarize(
    avg_salary = mean(salary),
    count = n()
  )

print(results)

Renaming with External Mapping Tables

For datasets requiring standardized column names, maintain a mapping table and apply it programmatically.

# Column mapping table
column_mapping <- data.frame(
  old_name = c("emp_id", "first_nm", "dept_cd", "sal_amt"),
  new_name = c("employee_id", "first_name", "department", "salary"),
  stringsAsFactors = FALSE
)

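# Function to apply mapping
apply_column_mapping <- function(df, mapping) {
  # seq_len() handles an empty mapping table safely (1:0 would iterate twice)
  for (i in seq_len(nrow(mapping))) {
    old <- mapping$old_name[i]
    new <- mapping$new_name[i]
    # Skip mapping entries whose old name is absent from df
    if (old %in% names(df)) {
      df <- df %>% rename(!!new := !!old)
    }
  }
  df
}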

employees_standardized <- apply_column_mapping(employees, column_mapping)
print(employees_standardized)

# Alternative using rename_with() and match()
employees %>%
  rename_with(~{
    idx <- match(., column_mapping$old_name)
    ifelse(is.na(idx), ., column_mapping$new_name[idx])
  })

This pattern works well when integrating data from multiple sources that need consistent column naming conventions. Store your mappings in CSV files or databases for reusability across projects.
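A more compact variant of the same idea is to turn the mapping into a named character vector and pass it to rename() through any_of(). This is a sketch that assumes a reasonably recent dplyr and tidyselect; the names `column_map`, `lookup`, and `staff`, and the extra `hire_dt` entry, are illustrative.

```r
library(dplyr)

# Illustrative mapping; in practice this might come from read.csv()
column_map <- data.frame(
  old_name = c("emp_id", "first_nm", "hire_dt"),
  new_name = c("employee_id", "first_name", "hire_date"),
  stringsAsFactors = FALSE
)

# Named vector in the form new_name = "old_name"
lookup <- setNames(column_map$old_name, column_map$new_name)

staff <- data.frame(
  emp_id = 1:2,
  first_nm = c("John", "Sarah"),
  stringsAsFactors = FALSE
)

# any_of() skips mapping entries (here hire_dt) that are absent from the data
staff_std <- staff %>% rename(any_of(lookup))
names(staff_std)
```

Swapping any_of() for all_of() makes the call strict: it then fails if any old name in the mapping is missing, which can be useful when the mapping is meant to describe the dataset exactly.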