R - If/Else/Else If Statements | Application Architect

Key Insights

R’s if/else statements test a single logical value; for element-wise conditions on vectors and data frame columns, use ifelse() or dplyr::case_when()
The else if ladder evaluates conditions sequentially, stopping at the first TRUE match, making condition order critical for correct logic
Vectorized alternatives like ifelse() and case_when() can substantially outperform explicit loops on large vectors, by roughly 50-100x in the benchmark later in this article

Basic If/Else Syntax

R’s conditional statements follow a straightforward structure. Although most R operations are vectorized and apply element-wise, the base if statement evaluates a single logical value.

temperature <- 75

if (temperature > 80) {
  print("It's hot outside")
} else if (temperature > 60) {
  print("Pleasant weather")
} else {
  print("It's cold")
}
# Output: "Pleasant weather"

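The condition must evaluate to a single TRUE or FALSE. In R 4.2.0 and later, passing a vector of length greater than one is an error; earlier versions issued a warning and used only the first element:

temps <- c(65, 85, 55)

if (temps > 70) {  # Error in R >= 4.2.0: the condition has length > 1
  print("Hot")
}
# R < 4.2.0: warning that the condition has length > 1 and only the first element is used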
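When the intent is to ask a yes/no question about a whole vector, one common approach is to collapse the comparison to a single logical value with any() or all() before handing it to if. The following is a minimal sketch using the same temps vector; the variable names any_hot and all_hot are illustrative only.

```r
temps <- c(65, 85, 55)

# any(): TRUE if at least one element satisfies the condition
any_hot <- any(temps > 70)

# all(): TRUE only if every element satisfies the condition
all_hot <- all(temps > 70)

if (any_hot) {
  print("At least one hot reading")
}

if (!all_hot) {
  print("Not every reading is hot")
}
```

Both calls return a length-one logical, so the if statement behaves the same way across R versions.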

Single-Line Conditionals

Because if is an expression that returns a value, simple assignments can omit the braces:

score <- 85
grade <- if (score >= 90) "A" else if (score >= 80) "B" else "C"
print(grade)  # "B"

# Inline assignment
status <- if (score >= 60) "Pass" else "Fail"

This pattern works well for configuration logic or simple transformations where readability isn’t compromised.
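Two details of the brace-free form are easy to miss, and the sketch below tries to show both. First, an if with no else yields NULL when the condition is FALSE. Second, at the top level of a script, an else that begins its own line is a syntax error, because the parser has already treated the if as complete. The variable names (bonus, broken, parse_result) are illustrative.

```r
score <- 45

# No else branch and a FALSE condition: the expression yields NULL
bonus <- if (score >= 90) "Bonus"
print(is.null(bonus))  # TRUE

# At top level, an `else` that starts its own line cannot be parsed,
# because the preceding `if` is already a complete expression.
broken <- "x <- if (TRUE) 1\nelse 2"
parse_result <- tryCatch(
  parse(text = broken),
  error = function(e) "syntax error"
)
print(parse_result)  # "syntax error"
```

Keeping else on the same line as the preceding value or closing brace avoids the second problem.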

Vectorized Conditionals with ifelse()

When working with vectors or data frame columns, ifelse() applies conditions element-wise:

temperatures <- c(55, 75, 95, 62, 88)
conditions <- ifelse(temperatures > 80, "Hot", "Moderate")
print(conditions)
# [1] "Moderate" "Moderate" "Hot"      "Moderate" "Hot"

Nested ifelse() handles multiple conditions but becomes unwieldy:

scores <- c(92, 78, 85, 65, 58)
grades <- ifelse(scores >= 90, "A",
                 ifelse(scores >= 80, "B",
                        ifelse(scores >= 70, "C",
                               ifelse(scores >= 60, "D", "F"))))
print(grades)
# [1] "A" "C" "B" "D" "F"

Performance note: ifelse() evaluates the yes and no expressions over the whole input vector, not only for the elements where each one applies, which can be wasteful with expensive computations.
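A side effect of this behavior is that a branch can run on elements it was never meant to see. The sketch below, with made-up input values, shows sqrt() being applied to a negative number even though the condition excludes it, and one possible alternative that computes only on the matching subset.

```r
x <- c(4, -9, 16)

# sqrt(x) is computed for every element, including -9, so R raises a
# "NaNs produced" warning even though that element ends up as NA.
warned <- FALSE
result <- withCallingHandlers(
  ifelse(x >= 0, sqrt(x), NA_real_),
  warning = function(w) {
    warned <<- TRUE                 # record that a warning occurred
    invokeRestart("muffleWarning")  # then silence it
  }
)
print(result)  # 2 NA 4
print(warned)  # TRUE

# Alternative: compute only where the condition holds
ok <- x >= 0
safe <- rep(NA_real_, length(x))
safe[ok] <- sqrt(x[ok])
print(safe)    # 2 NA 4
```

The subset-assignment form does the same work as ifelse() here, but sqrt() only ever sees the non-negative values.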

Multi-Condition Logic with case_when()

The dplyr::case_when() function provides cleaner syntax for complex conditional logic:

library(dplyr)

scores <- c(92, 78, 85, 65, 58, 95, 72)
grades <- case_when(
  scores >= 90 ~ "A",
  scores >= 80 ~ "B",
  scores >= 70 ~ "C",
  scores >= 60 ~ "D",
  TRUE ~ "F"  # Default case
)
print(grades)
# [1] "A" "C" "B" "D" "F" "A" "C"

Advantages over nested ifelse():

Reads top to bottom: each element takes the value of the first condition that is TRUE
More readable for multiple conditions
Type-safe: all outputs must be compatible types
Works seamlessly in mutate() pipelines

library(dplyr)

df <- data.frame(
  product = c("Widget", "Gadget", "Tool", "Device"),
  price = c(25, 150, 75, 300),
  quantity = c(100, 20, 50, 5)
)

df <- df %>%
  mutate(
    price_category = case_when(
      price < 50 ~ "Budget",
      price < 100 ~ "Mid-range",
      price < 200 ~ "Premium",
      TRUE ~ "Luxury"
    ),
    stock_status = case_when(
      quantity == 0 ~ "Out of Stock",
      quantity < 10 ~ "Low Stock",
      quantity < 50 ~ "Available",
      TRUE ~ "In Stock"
    )
  )

print(df)
#   product price quantity price_category stock_status
# 1  Widget    25      100         Budget     In Stock
# 2  Gadget   150       20        Premium    Available
# 3    Tool    75       50      Mid-range     In Stock
# 4  Device   300        5         Luxury    Low Stock
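The type-safety point listed among the advantages can be seen by mixing output types on purpose. This is a small sketch with made-up values; the exact wording of the error depends on the installed dplyr version, so it is caught and replaced with a placeholder string here.

```r
library(dplyr)

x <- c(1, 5)

# Base ifelse() silently coerces the numeric 0 to the string "0"
coerced <- ifelse(x > 3, "big", 0)
print(coerced)  # "0" "big"

# case_when() refuses to mix character and numeric outputs
strict <- tryCatch(
  case_when(x > 3 ~ "big", TRUE ~ 0),
  error = function(e) "type error"
)
print(strict)   # "type error"
```

Failing early tends to be preferable in a pipeline, since a silently coerced column can go unnoticed until a later numeric operation breaks.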

Logical Operators in Conditions

Combine conditions using logical operators:

age <- 25
income <- 55000

if (age >= 18 && income > 50000) {
  print("Eligible for premium account")
} else if (age >= 18 || income > 40000) {
  print("Eligible for standard account")
} else {
  print("Basic account only")
}
# Output: "Eligible for premium account"

Important distinction: Use && and || for scalar conditionals (they short-circuit, and since R 4.3.0 they raise an error when given a vector longer than one); use & and | for element-wise vectorized operations:

# Scalar (if statements)
x <- 5
if (x > 3 && x < 10) print("In range")  # Correct

# Vectorized (data operations)
values <- c(2, 5, 8, 12)
in_range <- values > 3 & values < 10
print(in_range)
# [1] FALSE  TRUE  TRUE FALSE
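Short-circuit evaluation can be observed directly by counting how often the right-hand side runs. In this sketch, expensive_check() is a hypothetical function that increments a counter each time it is called.

```r
calls <- 0

# Hypothetical check that records every invocation
expensive_check <- function() {
  calls <<- calls + 1
  TRUE
}

# && stops as soon as the left side is FALSE: the function is not called
scalar_result <- FALSE && expensive_check()
calls_after_scalar <- calls
print(calls_after_scalar)  # 0

# & always evaluates both sides: the function is called once
vector_result <- FALSE & expensive_check()
calls_after_vector <- calls
print(calls_after_vector)  # 1
```

This is why guards such as a NULL or length check are usually placed on the left of && inside an if condition.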

Handling NULL and NA Values

Conditional statements with NULL or NA require explicit handling:

value <- NA

# This fails: the comparison returns NA, which is neither TRUE nor FALSE
if (value > 10) {
  print("Large")
} else {
  print("Small")
}
# Error: missing value where TRUE/FALSE needed

# Correct approach
if (is.na(value)) {
  print("Missing value")
} else if (value > 10) {
  print("Large")
} else {
  print("Small")
}
# Output: "Missing value"
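NULL needs its own guard, because a comparison against NULL produces a zero-length logical and if then fails with "argument is of length zero". The following sketch extends the NA pattern above; classify() is a hypothetical helper name, and the labels it returns are illustrative.

```r
# Hypothetical helper: check for NULL and length before testing NA or comparing
classify <- function(value) {
  if (is.null(value) || length(value) != 1) {
    return("Not a single value")
  }
  if (is.na(value)) {
    return("Missing value")
  }
  if (value > 10) "Large" else "Small"
}

print(classify(NULL))  # "Not a single value"
print(classify(NA))    # "Missing value"
print(classify(15))    # "Large"
print(classify(3))     # "Small"
```

The order of the guards matters: is.null() runs first so that the later checks only ever see a length-one value.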

For vectorized operations, a condition that evaluates to NA yields NA in the result unless you handle it explicitly:

values <- c(5, NA, 15, 8, NA)

# ifelse preserves NAs
result <- ifelse(values > 10, "High", "Low")
print(result)
# [1] "Low"  NA     "High" "Low"  NA

# case_when with explicit NA handling
result <- case_when(
  is.na(values) ~ "Unknown",
  values > 10 ~ "High",
  TRUE ~ "Low"
)
print(result)
# [1] "Low"     "Unknown" "High"    "Low"     "Unknown"
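For a two-way split, dplyr::if_else() offers a missing argument that covers the NA case in a single call. A minimal sketch with the same values vector:

```r
library(dplyr)

values <- c(5, NA, 15, 8, NA)

# `missing` supplies the value used wherever the condition is NA
result <- if_else(values > 10, "High", "Low", missing = "Unknown")
print(result)
# [1] "Low"     "Unknown" "High"    "Low"     "Unknown"
```

Unlike base ifelse(), if_else() also requires the true, false, and missing values to have compatible types.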

Performance Comparison

Vectorized operations typically outperform explicit loops on large vectors:

library(dplyr)           # provides case_when()
library(microbenchmark)

n <- 100000
values <- runif(n, 0, 100)

# Loop approach
loop_approach <- function(x) {
  result <- character(length(x))
  for (i in seq_along(x)) {
    if (x[i] < 33) {
      result[i] <- "Low"
    } else if (x[i] < 67) {
      result[i] <- "Medium"
    } else {
      result[i] <- "High"
    }
  }
  result
}

# Vectorized approaches
ifelse_approach <- function(x) {
  ifelse(x < 33, "Low", ifelse(x < 67, "Medium", "High"))
}

case_when_approach <- function(x) {
  case_when(
    x < 33 ~ "Low",
    x < 67 ~ "Medium",
    TRUE ~ "High"
  )
}

microbenchmark(
  loop = loop_approach(values),
  ifelse = ifelse_approach(values),
  case_when = case_when_approach(values),
  times = 10
)
# Example results (median times; these vary by machine and R version):
# loop:       ~450ms
# ifelse:     ~4ms
# case_when:  ~8ms

In this run, the vectorized approaches are roughly 50-100x faster than the loop. Reserve loops for cases where a condition depends on previous iterations or requires complex state management.
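As an illustration of a condition that depends on the previous iteration, consider a thermostat with hysteresis: the heater switches on below one threshold, off above another, and otherwise keeps whatever state it had. The thresholds (18 and 22) and the readings are invented for this sketch.

```r
temps <- c(17, 19, 23, 21, 17.5, 20)

heater_on <- logical(length(temps))
state <- FALSE  # heater starts off

for (i in seq_along(temps)) {
  if (temps[i] < 18) {
    state <- TRUE    # too cold: switch on
  } else if (temps[i] > 22) {
    state <- FALSE   # too warm: switch off
  }
  # Between 18 and 22 the previous state carries over unchanged
  heater_on[i] <- state
}

print(heater_on)
# [1]  TRUE  TRUE FALSE FALSE  TRUE  TRUE
```

A single ifelse() or case_when() call has no access to the result for the previous element, so a loop (or a cumulative helper) is the natural fit here.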

Switch Statements for Discrete Values

For matching discrete values, switch() provides cleaner syntax than if/else chains:

get_day_type <- function(day) {
  switch(day,
    "Monday" = "Start of week",
    "Friday" = "End of week",
    "Saturday" = ,  # Fall through
    "Sunday" = "Weekend",
    "Weekday"  # Default
  )
}

print(get_day_type("Monday"))    # "Start of week"
print(get_day_type("Saturday"))  # "Weekend"
print(get_day_type("Tuesday"))   # "Weekday"

Numeric indices work but are error-prone:

switch(2, "First", "Second", "Third")  # "Second"
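Part of what makes numeric indices error-prone is that there is no default branch: an index outside the list of alternatives yields NULL without any message. The sketch below shows that behavior, along with one possible guard; pick_label() is a hypothetical helper.

```r
# No fourth alternative exists, so switch() quietly yields NULL
out_of_range <- switch(4, "First", "Second", "Third")
print(is.null(out_of_range))  # TRUE

# The same happens for a string with no match and no default
no_match <- switch("Tuesday", "Monday" = "Start of week")
print(is.null(no_match))      # TRUE

# Hypothetical helper that makes the failure visible instead
pick_label <- function(i) {
  labels <- c("First", "Second", "Third")
  if (i < 1 || i > length(labels)) {
    stop("index out of range: ", i)
  }
  labels[i]
}

print(pick_label(2))  # "Second"
```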

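For data frame operations, switch() is not vectorized, so use case_when() with exact matches instead. dplyr's recode() also handles this, but it has been superseded as of dplyr 1.1.0, which introduced case_match() for value-based recoding.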
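The day-type mapping from the switch() example can be written for a whole vector with case_when() and exact matches. This is a sketch with an invented days vector; the same expression would work inside mutate() on a data frame column.

```r
library(dplyr)

days <- c("Monday", "Saturday", "Tuesday", "Sunday")

day_type <- case_when(
  days == "Monday" ~ "Start of week",
  days == "Friday" ~ "End of week",
  days %in% c("Saturday", "Sunday") ~ "Weekend",  # replaces the fall-through
  TRUE ~ "Weekday"                                # default
)

print(day_type)
# [1] "Start of week" "Weekend"       "Weekday"       "Weekend"
```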

Practical Application: Data Cleaning Pipeline

Combining conditional logic in a real-world scenario:

library(dplyr)

sales_data <- data.frame(
  order_id = 1:6,
  amount = c(150, -50, 2500, 75, NA, 180),
  region = c("North", "South", "West", "East", "North", NA),
  customer_type = c("New", "Returning", "VIP", "New", "Returning", "New")
)

cleaned_data <- sales_data %>%
  mutate(
    # Flag invalid amounts
    valid_amount = !is.na(amount) & amount > 0,
    
    # Categorize order size
    order_size = case_when(
      !valid_amount ~ "Invalid",
      amount < 100 ~ "Small",
      amount < 500 ~ "Medium",
      amount < 1000 ~ "Large",
      TRUE ~ "Enterprise"
    ),
    
    # Handle missing regions
    region = if_else(is.na(region), "Unknown", region),
    
    # Calculate discount eligibility
    discount_eligible = case_when(
      !valid_amount ~ FALSE,
      customer_type == "VIP" ~ TRUE,
      customer_type == "Returning" & amount > 100 ~ TRUE,
      TRUE ~ FALSE
    )
  )

print(cleaned_data)

This pattern, which combines validation, categorization, and business logic, forms the backbone of many data preparation workflows in R.