R - ifelse() Function with Examples

Key Insights

• The ifelse() function provides vectorized conditional logic, evaluating conditions element-wise across vectors and returning values based on TRUE/FALSE results
• Unlike standard if-else statements, ifelse() operates on entire vectors simultaneously, making it essential for data manipulation in data frames and matrices
• Understanding ifelse() recycling rules, nested operations, and performance characteristics prevents common pitfalls in production R code

Understanding ifelse() Syntax

The ifelse() function takes three arguments: a test condition, a value to return when TRUE, and a value to return when FALSE. The syntax is:

ifelse(test, yes, no)

Here’s a basic example:

x <- c(1, 5, 10, 15, 20)
result <- ifelse(x > 10, "high", "low")
print(result)
# [1] "low"  "low"  "low"  "high" "high"

The function evaluates each element in x, returning “high” when the condition is TRUE and “low” when FALSE. This vectorized approach eliminates the need for explicit loops.
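
Because the result takes its shape from the test, the same call works on a matrix as well as a plain vector. The following is a minimal sketch; the matrix `m` and its values are made up for illustration.

```r
# A 2 x 3 matrix of hypothetical scores (filled column by column)
m <- matrix(c(1, 12, 7, 30, 10, 11), nrow = 2)

# m > 10 is a logical matrix, so ifelse() returns a character matrix
# with the same dimensions
labels <- ifelse(m > 10, "high", "low")

print(dim(labels))
print(labels)
```

The dimensions come from the test argument, not from yes or no, which are plain length-one strings here.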

Vectorized Operations on Data Frames

A common use case for ifelse() is creating new columns in data frames based on existing column values:

# Sample sales data
sales <- data.frame(
  product = c("A", "B", "C", "D", "E"),
  revenue = c(15000, 8000, 25000, 12000, 30000),
  units = c(150, 80, 200, 120, 250)
)

# Categorize revenue performance
sales$performance <- ifelse(sales$revenue > 20000, "Excellent", "Standard")

# Calculate bonus eligibility (the comparison already returns TRUE/FALSE,
# so wrapping it in ifelse() is unnecessary)
sales$bonus_eligible <- sales$units >= 150

print(sales)
#   product revenue units performance bonus_eligible
# 1       A   15000   150    Standard           TRUE
# 2       B    8000    80    Standard          FALSE
# 3       C   25000   200   Excellent           TRUE
# 4       D   12000   120    Standard          FALSE
# 5       E   30000   250   Excellent           TRUE
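
When several columns of the same data frame are involved, repeating the data frame name gets noisy. One option, sketched here with base R's with(), evaluates the expression inside the data frame so that columns can be referenced by name. The data frame `sales_small` is a shortened, hypothetical copy of the sales data.

```r
# Hypothetical, shortened copy of the sales data
sales_small <- data.frame(
  revenue = c(15000, 8000, 25000),
  units = c(150, 80, 200)
)

# with() evaluates the expression using the data frame's columns
sales_small$performance <- with(
  sales_small,
  ifelse(revenue > 20000, "Excellent", "Standard")
)

# A plain comparison is enough when the result should be logical
sales_small$bonus_eligible <- with(sales_small, units >= 150)

print(sales_small)
```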

Nested ifelse() for Multiple Conditions

For multi-tier categorization, nest ifelse() calls:

# Temperature classification
temps <- c(15, 25, 35, 5, 42, 18, 30)

classification <- ifelse(temps < 10, "Cold",
                  ifelse(temps < 20, "Cool",
                  ifelse(temps < 30, "Warm", "Hot")))

print(data.frame(temperature = temps, class = classification))
#   temperature class
# 1          15  Cool
# 2          25  Warm
# 3          35   Hot
# 4           5  Cold
# 5          42   Hot
# 6          18  Cool
# 7          30   Hot

While functional, deeply nested ifelse() becomes difficult to read. For complex logic, consider dplyr::case_when():

library(dplyr)

temps_df <- data.frame(temp = temps)
temps_df$class <- case_when(
  temps_df$temp < 10 ~ "Cold",
  temps_df$temp < 20 ~ "Cool",
  temps_df$temp < 30 ~ "Warm",
  TRUE ~ "Hot"
)
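
When the conditions are consecutive numeric ranges, as they are here, base R's cut() is another way to express the same binning without nesting. This is a sketch of an alternative, not a replacement for either approach; the break points mirror the thresholds used above, and right = FALSE makes each interval include its lower bound so that 30 still counts as "Hot".

```r
temps <- c(15, 25, 35, 5, 42, 18, 30)

# Intervals: [-Inf, 10), [10, 20), [20, 30), [30, Inf)
classification <- cut(
  temps,
  breaks = c(-Inf, 10, 20, 30, Inf),
  labels = c("Cold", "Cool", "Warm", "Hot"),
  right = FALSE
)

# cut() returns a factor; convert it if plain strings are needed
classification <- as.character(classification)
print(classification)
```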

Handling NA Values

ifelse() returns NA wherever the test itself evaluates to NA:

values <- c(5, 10, NA, 20, 15)
result <- ifelse(values > 12, "high", "low")
print(result)
# [1] "low"  "low"  NA     "high" "high"

To handle NAs explicitly, add a condition:

result <- ifelse(is.na(values), "missing",
          ifelse(values > 12, "high", "low"))
print(result)
# [1] "low"     "low"     "missing" "high"    "high"
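
An equivalent approach, sketched below, is to let the NA pass through first and then overwrite those positions with a logical index. This keeps the main condition free of nesting.

```r
values <- c(5, 10, NA, 20, 15)

# Step 1: classify; the NA element stays NA
result <- ifelse(values > 12, "high", "low")

# Step 2: overwrite the positions where the input was missing
result[is.na(values)] <- "missing"

print(result)
```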

Working with Multiple Columns

Combine multiple conditions using logical operators:

employees <- data.frame(
  name = c("Alice", "Bob", "Carol", "Dave", "Eve"),
  salary = c(75000, 55000, 95000, 62000, 88000),
  tenure = c(5, 2, 8, 3, 6)
)

# Senior employees: high salary AND long tenure
employees$level <- ifelse(employees$salary > 70000 & employees$tenure >= 5, 
                          "Senior", "Junior")

print(employees)
#    name salary tenure  level
# 1 Alice  75000      5 Senior
# 2   Bob  55000      2 Junior
# 3 Carol  95000      8 Senior
# 4  Dave  62000      3 Junior
# 5   Eve  88000      6 Senior

Numeric Calculations with ifelse()

Use ifelse() for conditional calculations:

# Apply discount based on order size
orders <- data.frame(
  order_id = 1:5,
  amount = c(500, 1500, 800, 2500, 1200)
)

# 15% discount for orders over 1000, otherwise 5%
orders$discount <- ifelse(orders$amount > 1000, 
                          orders$amount * 0.15, 
                          orders$amount * 0.05)

orders$final_amount <- orders$amount - orders$discount

print(orders)
#   order_id amount discount final_amount
# 1        1    500       25          475
# 2        2   1500      225         1275
# 3        3    800       40          760
# 4        4   2500      375         2125
# 5        5   1200      180         1020
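
Since both branches multiply the same amount, the condition only needs to choose the rate. The sketch below factors that out. The variable names `rate`, `discount`, and `final_amount` are illustrative.

```r
amount <- c(500, 1500, 800, 2500, 1200)

# Pick the rate per order, then apply it once
rate <- ifelse(amount > 1000, 0.15, 0.05)
discount <- amount * rate
final_amount <- amount - discount

print(data.frame(amount, rate, discount, final_amount))
```

Keeping the rate as its own vector also makes it easy to inspect or store alongside the result.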

Vector Recycling Behavior

ifelse() returns a result with the same length as test and recycles yes and no to match it:

# Scalar yes/no values recycled to the length of test
x <- 1:10
result <- ifelse(x %% 2 == 0, "even", "odd")
print(result)
#  [1] "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even"

# Recycling inside the comparison itself (thresholds is shorter than values)
values <- 1:6
thresholds <- c(3, 5)  # Will recycle: 3, 5, 3, 5, 3, 5
result <- ifelse(values > thresholds, "above", "below")
print(result)
# [1] "below" "below" "below" "below" "above" "above"

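Be cautious with recycling: comparison operators warn only when the longer length is not a multiple of the shorter one, and ifelse() does not warn at all when it recycles yes or no, so misaligned vectors can silently produce wrong results.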
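
One consequence worth seeing directly is that the result length follows test, so a longer yes vector is truncated without any message. The sketch below shows that, along with a small guard function. `strict_ifelse` is a hypothetical helper written for this example, not part of base R.

```r
# test has length 1, so only the first element of yes is used
truncated <- ifelse(TRUE, c("a", "b", "c"), "z")
print(truncated)

# Hypothetical helper: allow only scalars or vectors matching test's length
strict_ifelse <- function(test, yes, no) {
  stopifnot(
    length(yes) %in% c(1L, length(test)),
    length(no) %in% c(1L, length(test))
  )
  ifelse(test, yes, no)
}

print(strict_ifelse(c(TRUE, FALSE, TRUE), c(1, 2, 3), 0))
```

Calling strict_ifelse(TRUE, 1:3, 0) stops with an error instead of quietly returning a single value.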

Performance Considerations

For large datasets, ifelse() can be slower than alternatives:

# Benchmark different approaches
library(microbenchmark)

n <- 1e6
x <- runif(n, 0, 100)

microbenchmark(
  ifelse_method = ifelse(x > 50, "high", "low"),
  bracket_method = {
    result <- character(n)
    result[x > 50] <- "high"
    result[x <= 50] <- "low"
    result
  },
  times = 100
)

The bracket subsetting method often outperforms ifelse() for simple binary conditions on large vectors. However, ifelse() remains more readable and sufficient for most data analysis tasks.
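
For a binary split, the bracket approach can be reduced to a single assignment by filling the vector with the default label first. This is a sketch that assumes x contains no NA values; with missing values the filled default would remain where ifelse() would return NA. The names `n_small`, `scores`, and `labels_fast` are illustrative.

```r
set.seed(42)
n_small <- 1e5
scores <- runif(n_small, 0, 100)

# Start with the default label everywhere, then overwrite the TRUE cases
labels_fast <- rep("low", n_small)
labels_fast[scores > 50] <- "high"

# Same answer as ifelse() for NA-free input
print(identical(labels_fast, ifelse(scores > 50, "high", "low")))
```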

Type Coercion Gotchas

ifelse() returns a vector with a single type, coercing values as needed:

# Numeric and character mix
result <- ifelse(c(TRUE, FALSE), 100, "low")
print(result)
# [1] "100" "low"  # Both converted to character

# Better: keep types consistent
result <- ifelse(c(TRUE, FALSE), 100, 0)
print(result)
# [1] 100   0  # Both numeric

When mixing types, the result follows R’s coercion hierarchy: logical < integer < numeric < character.
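
Two related surprises are sketched below. First, the result type can depend on which branches are actually selected. Second, ifelse() takes its attributes from test, so classes such as Date are dropped from yes and no. The workaround shown (copy one vector and overwrite by index) is one common option, and the variable names are illustrative.

```r
# 1. Only the yes branch is used, so the result stays numeric
all_true <- ifelse(c(TRUE, TRUE), 100, "low")
print(class(all_true))

# 2. Date class is lost: the result is the underlying day counts
pick_start <- c(TRUE, FALSE)
start_date <- as.Date(c("2024-01-01", "2024-06-01"))
end_date <- as.Date(c("2024-03-01", "2024-09-01"))

stripped <- ifelse(pick_start, start_date, end_date)
print(class(stripped))

# Workaround: start from one Date vector and replace by logical index
kept <- end_date
kept[pick_start] <- start_date[pick_start]
print(kept)
```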

Practical Example: Data Cleaning Pipeline

Here’s a realistic data cleaning scenario combining multiple ifelse() operations:

# Raw customer data with issues
customers <- data.frame(
  id = 1:6,
  age = c(25, -5, 150, 45, NA, 32),
  income = c(50000, 75000, 0, 120000, 85000, NA),
  status = c("active", "ACTIVE", "inactive", "Active", "pending", "active")
)

# Clean age: set missing or out-of-range values to NA
customers$age_clean <- ifelse(is.na(customers$age) | customers$age < 0 | customers$age > 120,
                              NA, customers$age)

# Clean income: replace 0 and NA with median
median_income <- median(customers$income[customers$income > 0], na.rm = TRUE)
customers$income_clean <- ifelse(is.na(customers$income) | customers$income == 0,
                                 median_income, customers$income)

# Standardize status: anything other than "active" (in any letter case),
# including "pending", is labeled "Inactive"
customers$status_clean <- ifelse(tolower(customers$status) == "active",
                                 "Active", "Inactive")

# Create customer segment
customers$segment <- ifelse(customers$income_clean > 100000, "Premium",
                     ifelse(customers$income_clean > 60000, "Standard", "Basic"))

print(customers[, c("id", "age_clean", "income_clean", "status_clean", "segment")])
#   id age_clean income_clean status_clean  segment
# 1  1        25        50000       Active    Basic
# 2  2        NA        75000       Active Standard
# 3  3        NA        80000     Inactive Standard
# 4  4        45       120000       Active  Premium
# 5  5        NA        85000     Inactive Standard
# 6  6        32        80000       Active Standard

This example shows how a handful of ifelse() calls can handle common data quality issues (out-of-range values, missing values, and inconsistent labels) in a single preprocessing step.
