R dplyr - relocate() - Reorder Columns
Key Insights
- relocate() provides intuitive column reordering through the .before and .after arguments, where(), and selection helpers (starts_with(), contains(), etc.)
- Moving columns to the front requires just the column names; moving them to the end or positioning them relative to other columns uses the .before and .after arguments
- Conditional relocations with where() enable type-based or property-based column reorganization, which is useful for standardizing data frame structures
Basic Column Relocation
The relocate() function from dplyr moves columns to new positions within a data frame. By default, it moves specified columns to the leftmost position.
library(dplyr)
# Sample data frame
df <- data.frame(
  id = 1:5,
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  age = c(25, 30, 35, 28, 32),
  salary = c(50000, 60000, 55000, 52000, 58000),
  department = c("Sales", "IT", "HR", "Sales", "IT")
)
# Move department column to the front
df %>% relocate(department)
Output:
  department id    name age salary
1      Sales  1   Alice  25  50000
2         IT  2     Bob  30  60000
3         HR  3 Charlie  35  55000
4      Sales  4   David  28  52000
5         IT  5     Eve  32  58000
Multiple columns can be relocated in one call; they are placed in the order they are listed:
# Move both name and department to the front
df %>% relocate(name, department)
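To see how the listing order affects the result, the column names can be compared directly. This is a small sketch that rebuilds the sample `df` so it runs on its own; `front_a` and `front_b` are illustrative names, not part of the original example.

```r
library(dplyr)

# Same sample data as above, rebuilt so the snippet is self-contained
df <- data.frame(
  id = 1:5,
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  age = c(25, 30, 35, 28, 32),
  salary = c(50000, 60000, 55000, 52000, 58000),
  department = c("Sales", "IT", "HR", "Sales", "IT"),
  stringsAsFactors = FALSE
)

# The relocated columns land in the order given in the call
front_a <- names(df %>% relocate(name, department))
front_b <- names(df %>% relocate(department, name))

print(front_a)  # "name" "department" "id" "age" "salary"
print(front_b)  # "department" "name" "id" "age" "salary"
```

The columns that are not mentioned keep their original relative order behind the relocated ones.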
Using .before and .after Arguments
Position columns precisely using .before and .after arguments:
# Place salary before age
df %>% relocate(salary, .before = age)
# Place department after name
df %>% relocate(department, .after = name)
# Place id at the end using .after with last_col()
df %>% relocate(id, .after = last_col())
The .after = last_col() pattern effectively moves columns to the rightmost position:
# Move multiple columns to the end
df %>% relocate(id, name, .after = last_col())
Output:
  age salary department id    name
1  25  50000      Sales  1   Alice
2  30  60000         IT  2     Bob
3  35  55000         HR  3 Charlie
4  28  52000      Sales  4   David
5  32  58000         IT  5     Eve
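The positions produced by .before and .after can be checked by looking at the column names. The sketch below uses a shortened copy of `df` and also shows that the two arguments cannot be combined in one call; the variable names (`before_age`, `after_name`, `both_args`) are illustrative.

```r
library(dplyr)

# Two-row copy of the sample data, enough to inspect column order
df <- data.frame(
  id = 1:2,
  name = c("Alice", "Bob"),
  age = c(25, 30),
  salary = c(50000, 60000),
  department = c("Sales", "IT"),
  stringsAsFactors = FALSE
)

before_age <- names(df %>% relocate(salary, .before = age))
after_name <- names(df %>% relocate(department, .after = name))

print(before_age)  # "id" "name" "salary" "age" "department"
print(after_name)  # "id" "name" "department" "age" "salary"

# Supplying both .before and .after is rejected with an error
both_args <- tryCatch(
  df %>% relocate(salary, .before = age, .after = name),
  error = function(e) "error"
)
print(both_args)  # "error"
```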
Selection Helpers with relocate()
Combine relocate() with tidyselect helpers for pattern-based column reordering:
# Create a more complex data frame
employee_data <- data.frame(
  emp_id = 1:4,
  emp_name = c("John", "Jane", "Mike", "Sarah"),
  emp_age = c(28, 32, 45, 29),
  dept_code = c("D01", "D02", "D01", "D03"),
  dept_name = c("Engineering", "Marketing", "Engineering", "Sales"),
  salary_base = c(70000, 75000, 90000, 72000),
  salary_bonus = c(5000, 6000, 10000, 5500)
)
# Move all columns starting with "dept_" to the front
employee_data %>% relocate(starts_with("dept_"))
# Move all salary columns to the end
employee_data %>% relocate(starts_with("salary_"), .after = last_col())
# Move all columns containing "name" after emp_id
employee_data %>% relocate(contains("name"), .after = emp_id)
Other useful selection helpers include:
# ends_with() - columns ending with a pattern
employee_data %>% relocate(ends_with("_code"))
# matches() - columns matching a regex
employee_data %>% relocate(matches("^emp_"))
# num_range() - numbered columns
data_with_nums <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9, y = 10:12)
data_with_nums %>% relocate(num_range("x", 2:3))
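As a quick check of what the helpers select, the resulting column names can be printed. This sketch rebuilds `employee_data` and `data_with_nums` so it is self-contained; the result variables are illustrative names.

```r
library(dplyr)

employee_data <- data.frame(
  emp_id = 1:4,
  emp_name = c("John", "Jane", "Mike", "Sarah"),
  emp_age = c(28, 32, 45, 29),
  dept_code = c("D01", "D02", "D01", "D03"),
  dept_name = c("Engineering", "Marketing", "Engineering", "Sales"),
  salary_base = c(70000, 75000, 90000, 72000),
  salary_bonus = c(5000, 6000, 10000, 5500),
  stringsAsFactors = FALSE
)

# contains("name") matches emp_name and dept_name; both move after emp_id
names_after_id <- names(
  employee_data %>% relocate(contains("name"), .after = emp_id)
)
print(names_after_id)
# "emp_id" "emp_name" "dept_name" "emp_age" "dept_code" "salary_base" "salary_bonus"

# num_range("x", 2:3) matches x2 and x3
data_with_nums <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9, y = 10:12)
nums_front <- names(data_with_nums %>% relocate(num_range("x", 2:3)))
print(nums_front)  # "x2" "x3" "x1" "y"
```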
Conditional Relocation with where()
The where() function enables type-based or condition-based column relocation:
# Move all numeric columns to the front
employee_data %>% relocate(where(is.numeric))
# Move all character columns to the end
employee_data %>% relocate(where(is.character), .after = last_col())
# Move all factor columns before a specific column
df_with_factors <- df %>%
  mutate(department = as.factor(department))
df_with_factors %>% relocate(where(is.factor), .before = age)
Custom predicates work with where():
# Move columns with all values > 1000 to the end
employee_data %>%
  relocate(where(~ is.numeric(.) && all(. > 1000, na.rm = TRUE)),
           .after = last_col())
# Move columns with any NA values to the front
data_with_na <- employee_data
data_with_na$salary_bonus[2] <- NA
data_with_na %>%
  relocate(where(~ any(is.na(.))))
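A predicate passed to where() can also be an ordinary named function, which some readers find easier to test than a formula. The sketch below assumes a hypothetical helper called `has_missing`; it is not part of dplyr.

```r
library(dplyr)

employee_data <- data.frame(
  emp_id = 1:4,
  emp_name = c("John", "Jane", "Mike", "Sarah"),
  emp_age = c(28, 32, 45, 29),
  dept_code = c("D01", "D02", "D01", "D03"),
  dept_name = c("Engineering", "Marketing", "Engineering", "Sales"),
  salary_base = c(70000, 75000, 90000, 72000),
  salary_bonus = c(5000, 6000, 10000, 5500),
  stringsAsFactors = FALSE
)
employee_data$salary_bonus[2] <- NA

# Hypothetical predicate: TRUE when a column holds at least one NA
has_missing <- function(x) anyNA(x)

numeric_first <- names(employee_data %>% relocate(where(is.numeric)))
na_first <- names(employee_data %>% relocate(where(has_missing)))

print(numeric_first)
# "emp_id" "emp_age" "salary_base" "salary_bonus" "emp_name" "dept_code" "dept_name"
print(na_first)
# "salary_bonus" "emp_id" "emp_name" "emp_age" "dept_code" "dept_name" "salary_base"
```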
Combining Multiple Relocate Operations
Chain multiple relocate() calls or use complex selections:
employee_data %>%
  relocate(starts_with("emp_")) %>% # All emp_ columns first
  relocate(emp_id) %>% # ID at the very front
  relocate(where(is.numeric) & !starts_with("emp_"), .after = last_col()) # Remaining numeric columns last
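One thing worth checking when chaining: where(is.numeric) matches every numeric column, including identifiers such as emp_id and emp_age. The sketch below compares a plain numeric selection with one that excludes the emp_ columns; the result variable names are illustrative.

```r
library(dplyr)

employee_data <- data.frame(
  emp_id = 1:4,
  emp_name = c("John", "Jane", "Mike", "Sarah"),
  emp_age = c(28, 32, 45, 29),
  dept_code = c("D01", "D02", "D01", "D03"),
  dept_name = c("Engineering", "Marketing", "Engineering", "Sales"),
  salary_base = c(70000, 75000, 90000, 72000),
  salary_bonus = c(5000, 6000, 10000, 5500),
  stringsAsFactors = FALSE
)

# Every numeric column moves to the end, emp_id and emp_age included
all_numeric_last <- names(
  employee_data %>% relocate(where(is.numeric), .after = last_col())
)
print(all_numeric_last)
# "emp_name" "dept_code" "dept_name" "emp_id" "emp_age" "salary_base" "salary_bonus"

# Selections combine with & and !, so the emp_ columns can be left in place
other_numeric_last <- names(
  employee_data %>%
    relocate(where(is.numeric) & !starts_with("emp_"), .after = last_col())
)
print(other_numeric_last)
# "emp_id" "emp_name" "emp_age" "dept_code" "dept_name" "salary_base" "salary_bonus"
```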
A single call can produce a comparable layout:
employee_data %>%
  relocate(
    emp_id, # First
    emp_name, emp_age, # Second and third
    starts_with("dept_"), # Department columns
    .before = where(is.numeric) & starts_with("salary_") # Before the first salary column
  )
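When .before or .after is given a selection that matches several columns, it helps to know which one is used as the anchor. The sketch below assumes the behavior of current dplyr releases, where .before resolves to the first matched column and .after to the last; the variable names are illustrative.

```r
library(dplyr)

employee_data <- data.frame(
  emp_id = 1:4,
  emp_name = c("John", "Jane", "Mike", "Sarah"),
  emp_age = c(28, 32, 45, 29),
  dept_code = c("D01", "D02", "D01", "D03"),
  dept_name = c("Engineering", "Marketing", "Engineering", "Sales"),
  salary_base = c(70000, 75000, 90000, 72000),
  salary_bonus = c(5000, 6000, 10000, 5500),
  stringsAsFactors = FALSE
)

# .before with three matches: dept_code goes before the first one (emp_id)
before_first <- names(
  employee_data %>% relocate(dept_code, .before = starts_with("emp_"))
)
print(before_first)
# "dept_code" "emp_id" "emp_name" "emp_age" "dept_name" "salary_base" "salary_bonus"

# .after with three matches: salary_base goes after the last one (emp_age)
after_last <- names(
  employee_data %>% relocate(salary_base, .after = starts_with("emp_"))
)
print(after_last)
# "emp_id" "emp_name" "emp_age" "salary_base" "dept_code" "dept_name" "salary_bonus"
```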
Practical Use Cases
Standardizing Data Frame Structure
Enforce consistent column ordering across multiple data frames:
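standardize_columns <- function(df) {
  key_cols <- c("id", "name", "date")
  df %>%
    relocate(any_of(key_cols)) %>%
    # Exclude the key columns so a numeric id stays in front
    relocate(where(is.numeric) & !any_of(key_cols), .after = last_col())
}
# Apply to multiple data frames (df1, df2, df3 stand in for your own data)
list_of_dfs <- list(df1, df2, df3)
standardized <- lapply(list_of_dfs, standardize_columns)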
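A runnable version of this idea, as a sketch only: `standardize_sketch`, `orders`, and `visits` are hypothetical names and made-up data, and the helper keeps the key columns out of the numeric selection so that a numeric id is not pushed to the end.

```r
library(dplyr)

# Hypothetical helper: key columns first, remaining numeric columns last
standardize_sketch <- function(data, key_cols = c("id", "name", "date")) {
  data %>%
    relocate(any_of(key_cols)) %>%
    relocate(where(is.numeric) & !any_of(key_cols), .after = last_col())
}

# Two made-up data frames with different column sets and orders
orders <- data.frame(
  amount = c(10.5, 20),
  name = c("a", "b"),
  id = 1:2,
  note = c("x", "y"),
  stringsAsFactors = FALSE
)
visits <- data.frame(
  count = c(3L, 7L),
  site = c("north", "south"),
  id = 1:2,
  stringsAsFactors = FALSE
)

standardized <- lapply(list(orders = orders, visits = visits), standardize_sketch)

print(names(standardized$orders))  # "id" "name" "note" "amount"
print(names(standardized$visits))  # "id" "site" "count"
```

Because any_of() skips names that are absent, the same helper works on `visits`, which has neither a name nor a date column.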
Preparing Data for Display
Reorder columns for better readability in reports:
# Move summary columns to the front, details to the back
report_data <- employee_data %>%
  relocate(emp_name, dept_name) %>%
  relocate(starts_with("salary_"), .after = last_col())
Grouping Related Columns
Organize columns by logical groupings:
survey_data <- data.frame(
  respondent_id = 1:3,
  q1_rating = c(4, 5, 3),
  q1_comment = c("Good", "Great", "OK"),
  q2_rating = c(3, 4, 5),
  q2_comment = c("Fair", "Good", "Excellent"),
  age = c(25, 30, 35),
  gender = c("F", "M", "F")
)
# Group demographic columns at the start, then questions
survey_data %>%
  relocate(respondent_id, age, gender) %>%
  relocate(contains("q1_"), .after = gender) %>%
  relocate(contains("q2_"), .after = contains("q1_"))
Performance Considerations
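Each relocate() call returns a new data frame and re-evaluates its column selection. R does not copy the underlying column vectors when only their order changes, so the cost is usually small, but a single call is simpler to read and avoids repeated selection overhead: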
# Less efficient - multiple relocations
df %>%
  relocate(col1) %>%
  relocate(col2, .after = col1) %>%
  relocate(col3, .after = col2)
# More efficient - single relocation
df %>%
  relocate(col1, col2, col3)
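The two forms can be checked for equivalence on a small example. The data frame `wide` below is made up for illustration, with col1, col2, and col3 scattered among other columns.

```r
library(dplyr)

# Made-up data frame with the target columns out of order
wide <- data.frame(a = 1:2, col3 = 3:4, b = 5:6, col1 = 7:8, col2 = 9:10)

chained <- wide %>%
  relocate(col1) %>%
  relocate(col2, .after = col1) %>%
  relocate(col3, .after = col2)

single <- wide %>%
  relocate(col1, col2, col3)

print(names(chained))  # "col1" "col2" "col3" "a" "b"
print(names(single))   # "col1" "col2" "col3" "a" "b"

# Both approaches give the same column order
same_order <- identical(names(chained), names(single))
print(same_order)  # TRUE
```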
For programmatic column reordering with column names stored in a variable:
priority_cols <- c("emp_id", "emp_name")
employee_data %>%
  relocate(all_of(priority_cols))
The all_of() and any_of() functions accept character vectors of column names. all_of() raises an error if any name is missing, while any_of() silently ignores names that do not exist, which is useful for defensive programming across varying data structures.
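The difference between the two shows up as soon as a requested column is absent. In this sketch, `hire_date` is a deliberately missing, hypothetical column name, and the three-column `employee_data` is a trimmed copy for brevity.

```r
library(dplyr)

employee_data <- data.frame(
  emp_id = 1:2,
  emp_name = c("John", "Jane"),
  salary_base = c(70000, 75000),
  stringsAsFactors = FALSE
)

# hire_date does not exist in employee_data
wanted <- c("emp_name", "hire_date")

# any_of() relocates the names it finds and skips the rest
lenient <- names(employee_data %>% relocate(any_of(wanted)))
print(lenient)  # "emp_name" "emp_id" "salary_base"

# all_of() stops with an error when a name is missing
strict <- tryCatch(
  names(employee_data %>% relocate(all_of(wanted))),
  error = function(e) "all_of() raised an error"
)
print(strict)  # "all_of() raised an error"
```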