R dplyr - relocate() - Reorder Columns

The `relocate()` function from dplyr moves columns to new positions within a data frame. By default, it moves specified columns to the leftmost position.

Key Insights

  • relocate() provides intuitive column reordering through the .before and .after arguments, where(), and selection helpers (starts_with(), contains(), etc.)
  • Moving columns to the front or end requires just the column names; positioning relative to other columns uses .before and .after arguments
  • Conditional relocations with where() enable type-based or property-based column reorganization, perfect for standardizing data frame structures

Basic Column Relocation

Pass one or more column names to relocate() and they move to the front of the data frame; the remaining columns keep their original order.

library(dplyr)

# Sample data frame
df <- data.frame(
  id = 1:5,
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  age = c(25, 30, 35, 28, 32),
  salary = c(50000, 60000, 55000, 52000, 58000),
  department = c("Sales", "IT", "HR", "Sales", "IT")
)

# Move department column to the front
df %>% relocate(department)

Output:

  department id    name age salary
1      Sales  1   Alice  25  50000
2         IT  2     Bob  30  60000
3         HR  3 Charlie  35  55000
4      Sales  4   David  28  52000
5         IT  5     Eve  32  58000
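
As a quick check of the behavior above, the following sketch uses a shortened version of the sample data (the variable name `moved` is only illustrative). It suggests that relocate() returns a reordered copy and leaves the original data frame untouched, so the result needs to be assigned if you want to keep it.

```r
library(dplyr)

df <- data.frame(
  id = 1:3,
  name = c("Alice", "Bob", "Charlie"),
  department = c("Sales", "IT", "HR")
)

# relocate() returns a new data frame; assign it to keep the result
moved <- df %>% relocate(department)

names(moved)  # "department" "id" "name"
names(df)     # "id" "name" "department" (unchanged)
```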

Multiple columns can be relocated in one call. They are placed in the order you list them:

# Move both name and department to the front
df %>% relocate(name, department)
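
To make the ordering rule concrete, here is a minimal sketch on a cut-down data frame (the names `front_a` and `front_b` are illustrative). Swapping the arguments swaps the resulting column order.

```r
library(dplyr)

df <- data.frame(id = 1:2, name = c("Alice", "Bob"),
                 age = c(25, 30), department = c("Sales", "IT"))

# Columns land in the order they are listed in the call
front_a <- df %>% relocate(name, department)
front_b <- df %>% relocate(department, name)

names(front_a)  # "name" "department" "id" "age"
names(front_b)  # "department" "name" "id" "age"
```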

Using .before and .after Arguments

Position columns precisely using .before and .after arguments:

# Place salary before age
df %>% relocate(salary, .before = age)

# Place department after name
df %>% relocate(department, .after = name)

# Place id at the end using .after with last_col()
df %>% relocate(id, .after = last_col())
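
The sketch below inspects the resulting column names for two of these calls on a smaller data frame. It also illustrates one constraint worth knowing: .before and .after cannot be combined in the same call. The `tryCatch()` wrapper and the name `both` are only there for illustration.

```r
library(dplyr)

df <- data.frame(id = 1:2, name = c("Alice", "Bob"),
                 age = c(25, 30), salary = c(50000, 60000))

names(df %>% relocate(salary, .before = age))
# "id" "name" "salary" "age"

names(df %>% relocate(id, .after = last_col()))
# "name" "age" "salary" "id"

# Supplying both anchors at once is rejected with an error
both <- tryCatch(
  df %>% relocate(salary, .before = age, .after = id),
  error = function(e) "error"
)
both  # "error"
```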

The .after = last_col() pattern effectively moves columns to the rightmost position:

# Move multiple columns to the end
df %>% relocate(id, name, .after = last_col())

Output:

  age salary department id    name
1  25  50000      Sales  1   Alice
2  30  60000         IT  2     Bob
3  35  55000         HR  3 Charlie
4  28  52000      Sales  4   David
5  32  58000         IT  5     Eve

Selection Helpers with relocate()

Combine relocate() with tidyselect helpers for pattern-based column reordering:

# Create a more complex data frame
employee_data <- data.frame(
  emp_id = 1:4,
  emp_name = c("John", "Jane", "Mike", "Sarah"),
  emp_age = c(28, 32, 45, 29),
  dept_code = c("D01", "D02", "D01", "D03"),
  dept_name = c("Engineering", "Marketing", "Engineering", "Sales"),
  salary_base = c(70000, 75000, 90000, 72000),
  salary_bonus = c(5000, 6000, 10000, 5500)
)

# Move all columns starting with "dept_" to the front
employee_data %>% relocate(starts_with("dept_"))

# Move all salary columns to the end
employee_data %>% relocate(starts_with("salary_"), .after = last_col())

# Move all columns containing "name" after emp_id
employee_data %>% relocate(contains("name"), .after = emp_id)

Other useful selection helpers include:

# ends_with() - columns ending with a pattern
employee_data %>% relocate(ends_with("_code"))

# matches() - columns matching a regex
employee_data %>% relocate(matches("^emp_"))

# num_range() - numbered columns
data_with_nums <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9, y = 10:12)
data_with_nums %>% relocate(num_range("x", 2:3))
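
A small sketch of what these helpers produce, using a trimmed copy of the employee data. The prefix "bonus_" is a made-up pattern that matches no column. It is included to show that a helper with no matches leaves the column order as it was rather than raising an error.

```r
library(dplyr)

employee_data <- data.frame(
  emp_id = 1:2,
  emp_name = c("John", "Jane"),
  dept_code = c("D01", "D02"),
  salary_base = c(70000, 75000)
)

names(employee_data %>% relocate(starts_with("dept_")))
# "dept_code" "emp_id" "emp_name" "salary_base"

names(employee_data %>% relocate(contains("name"), .after = last_col()))
# "emp_id" "dept_code" "salary_base" "emp_name"

# A helper that matches nothing leaves the column order as it was
unchanged <- employee_data %>% relocate(starts_with("bonus_"))
identical(names(unchanged), names(employee_data))  # TRUE
```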

Conditional Relocation with where()

The where() function enables type-based or condition-based column relocation:

# Move all numeric columns to the front
employee_data %>% relocate(where(is.numeric))

# Move all character columns to the end
employee_data %>% relocate(where(is.character), .after = last_col())

# Move all factor columns before a specific column
df_with_factors <- df %>% 
  mutate(department = as.factor(department))

df_with_factors %>% relocate(where(is.factor), .before = age)

Custom predicates work with where():

# Move columns with all values > 1000 to the end
employee_data %>% 
  relocate(where(~ is.numeric(.) && all(. > 1000, na.rm = TRUE)), 
           .after = last_col())

# Move columns with any NA values to the front
data_with_na <- employee_data
data_with_na$salary_bonus[2] <- NA

data_with_na %>% 
  relocate(where(~ any(is.na(.))))
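
The predicate passed to where() does not have to be a ~ formula. As a sketch, the example below uses a trimmed employee data frame and a hypothetical named helper, `has_missing()`, which can be easier to test and reuse than an inline lambda.

```r
library(dplyr)

employee_data <- data.frame(
  emp_id = 1:3,
  emp_name = c("John", "Jane", "Mike"),
  salary_base = c(70000, 75000, 90000),
  salary_bonus = c(5000, NA, 10000)
)

# Type-based: numeric columns first
names(employee_data %>% relocate(where(is.numeric)))
# "emp_id" "salary_base" "salary_bonus" "emp_name"

# Custom predicate written as an ordinary function instead of a ~ formula
has_missing <- function(x) any(is.na(x))
names(employee_data %>% relocate(where(has_missing)))
# "salary_bonus" "emp_id" "emp_name" "salary_base"
```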

Combining Multiple Relocate Operations

Chain multiple relocate() calls or use complex selections:

employee_data %>%
  relocate(emp_id) %>%                                        # ID first
  relocate(starts_with("emp_"), .before = everything()) %>%   # All emp_ columns at the front, emp_id still leading
  relocate(where(is.numeric) & !starts_with("emp_"), .after = last_col())  # Remaining numeric columns last

A single call can often express the same layout more compactly:

employee_data %>%
  relocate(
    emp_id,                           # First
    emp_name, emp_age,                # Second and third
    starts_with("dept_"),             # Department columns
    .before = where(is.numeric) & starts_with("salary_")  # Before salary columns
  )
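
When chaining, each relocate() works on the output of the previous call, so a broad selection late in the pipeline can displace a column that an earlier step positioned. The sketch below uses a trimmed data frame and illustrative variable names. It shows the pitfall and one way to guard against it by combining where() with a negated column name.

```r
library(dplyr)

employee_data <- data.frame(
  emp_id = 1:2,
  emp_name = c("John", "Jane"),
  dept_name = c("Engineering", "Marketing"),
  salary_base = c(70000, 75000)
)

# where(is.numeric) also matches emp_id, so it is pushed to the end
displaced <- employee_data %>%
  relocate(dept_name) %>%
  relocate(where(is.numeric), .after = last_col())
names(displaced)
# "dept_name" "emp_name" "emp_id" "salary_base"

# Excluding emp_id from the selection keeps it where it was
guarded <- employee_data %>%
  relocate(dept_name) %>%
  relocate(where(is.numeric) & !emp_id, .after = last_col())
names(guarded)
# "dept_name" "emp_id" "emp_name" "salary_base"
```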

Practical Use Cases

Standardizing Data Frame Structure

Enforce consistent column ordering across multiple data frames:

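standardize_columns <- function(df) {
  key_cols <- c("id", "name", "date")
  df %>%
    relocate(any_of(key_cols)) %>%
    # Exclude the key columns so a numeric id is not pushed to the end
    relocate(where(is.numeric) & !any_of(key_cols), .after = last_col())
}

# Apply to multiple data frames (df1, df2, df3 stand in for your own data)
list_of_dfs <- list(df1, df2, df3)
standardized <- lapply(list_of_dfs, standardize_columns)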
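
Here is a self-contained sketch of the same idea that can be run as-is. The helper `key_cols_first()` and the data frames `orders` and `people` are hypothetical and exist only to show how any_of() copes with inputs that each contain a different subset of the key columns.

```r
library(dplyr)

# Hypothetical helper: move the key columns to the front when they are present
key_cols_first <- function(data, key_cols = c("id", "name", "date")) {
  data %>% relocate(any_of(key_cols))
}

# Two illustrative data frames with different columns
orders <- data.frame(amount = c(10, 20), id = 1:2,
                     date = c("2024-01-01", "2024-01-02"))
people <- data.frame(city = c("Oslo", "Rome"), name = c("Ann", "Ben"))

standardized <- lapply(list(orders = orders, people = people), key_cols_first)

names(standardized$orders)  # "id" "date" "amount"
names(standardized$people)  # "name" "city"
```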

Preparing Data for Display

Reorder columns for better readability in reports:

# Move summary columns to the front, details to the back
report_data <- employee_data %>%
  relocate(emp_name, dept_name) %>%
  relocate(starts_with("salary_"), .after = last_col())

Organize columns by logical groupings:

survey_data <- data.frame(
  respondent_id = 1:3,
  q1_rating = c(4, 5, 3),
  q1_comment = c("Good", "Great", "OK"),
  q2_rating = c(3, 4, 5),
  q2_comment = c("Fair", "Good", "Excellent"),
  age = c(25, 30, 35),
  gender = c("F", "M", "F")
)

# Group demographic columns at the start, then questions
survey_data %>%
  relocate(respondent_id, age, gender) %>%
  relocate(contains("q1_"), .after = gender) %>%
  relocate(contains("q2_"), .after = contains("q1_"))
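
The same grouping can usually be written as one call by listing the groups in the order they should appear. The sketch below deliberately starts from a shuffled column order (an assumption made for illustration, not the survey_data defined above) so that the effect is visible.

```r
library(dplyr)

# Same columns as the survey example, but in a shuffled starting order
survey_data <- data.frame(
  q1_rating = c(4, 5),
  age = c(25, 30),
  q2_rating = c(3, 4),
  respondent_id = 1:2,
  q1_comment = c("Good", "Great"),
  gender = c("F", "M"),
  q2_comment = c("Fair", "Good")
)

# Demographics first, then each question block in turn
grouped <- survey_data %>%
  relocate(respondent_id, age, gender,
           starts_with("q1_"), starts_with("q2_"))

names(grouped)
# "respondent_id" "age" "gender" "q1_rating" "q1_comment" "q2_rating" "q2_comment"
```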

Performance Considerations

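Each relocate() call returns a new data frame with the columns reordered. R generally does not copy the underlying column data when it does this, so an individual call is cheap, but collapsing a chain into one call still avoids repeated selection work and is easier to read: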

# Less efficient - multiple relocations (col1, col2, col3 are placeholder column names)
df %>%
  relocate(col1) %>%
  relocate(col2, .after = col1) %>%
  relocate(col3, .after = col2)

# More efficient - single relocation
df %>%
  relocate(col1, col2, col3)
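
A runnable version of this comparison, using real column names in place of the col1, col2, col3 placeholders (the data frame and the names `chained` and `single` are illustrative), suggests that both forms arrive at the same column order:

```r
library(dplyr)

df <- data.frame(age = c(25, 30), id = 1:2,
                 salary = c(50000, 60000), name = c("Alice", "Bob"))

# Three separate calls, each anchored on the previous column
chained <- df %>%
  relocate(id) %>%
  relocate(name, .after = id) %>%
  relocate(age, .after = name)

# One call listing the columns in their final order
single <- df %>% relocate(id, name, age)

identical(names(chained), names(single))  # TRUE
names(single)                             # "id" "name" "age" "salary"
```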

To reorder columns programmatically from a character vector of names, wrap the vector in all_of():

priority_cols <- c("emp_id", "emp_name")

employee_data %>%
  relocate(all_of(priority_cols))

The all_of() and any_of() functions both accept a character vector of column names. all_of() raises an error if any name is missing, while any_of() silently skips names that do not exist, which is useful for defensive programming across varying data structures.
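
The difference between the two is easiest to see with a name that is not in the data. In this sketch, "hire_date" is a deliberately absent, made-up column, and the `tryCatch()` wrapper is only there to capture the error for display.

```r
library(dplyr)

employee_data <- data.frame(emp_name = c("John", "Jane"), emp_id = 1:2,
                            dept_code = c("D01", "D02"))

# "hire_date" is deliberately absent from the data
wanted <- c("emp_id", "hire_date")

# any_of() skips names that are not present
names(employee_data %>% relocate(any_of(wanted)))
# "emp_id" "emp_name" "dept_code"

# all_of() signals an error when any requested name is missing
strict <- tryCatch(
  employee_data %>% relocate(all_of(wanted)),
  error = function(e) "missing column"
)
strict  # "missing column"
```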
