R - Variables and Assignment Operators


Key Insights

• R uses <- as the primary assignment operator by convention, though = works in most contexts; understanding the subtle differences prevents unexpected scoping issues
• Variable names in R are case-sensitive and can include letters, numbers, dots, and underscores, but must start with a letter or a dot (not followed by a number)
• R’s dynamic typing system allows variables to change types freely, but the lack of explicit type checking requires defensive programming practices to avoid runtime errors

Assignment Operators: <- vs = vs <<-

This section covers three assignment operators, each with distinct behavior (R also has the rightward forms -> and ->>, which are rarely used). The <- operator is the standard assignment method in R, deeply embedded in the language’s culture and syntax.

# Standard assignment with <-
x <- 10
y <- "hello"
z <- c(1, 2, 3, 4, 5)

# Assignment with = (works in most contexts)
x = 10
y = "hello"

# At the top level both forms store the same value
a <- 10
b = 10
identical(a, b)  # TRUE

The critical difference emerges in function calls. Inside a call, = binds a value to a named argument and does not create a variable, while <- always performs assignment:

# Using = for function arguments
mean(x = c(1, 2, 3, 4, 5))  # Correct: x is a parameter name

# Using <- in function calls (creates the variable in the calling environment)
mean(x <- c(1, 2, 3, 4, 5))  # Creates x AND passes its value as an unnamed argument

# Verify the variable was created
print(x)  # [1] 1 2 3 4 5
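The difference can be checked directly. The following is a minimal sketch, assuming a hypothetical helper count_items() and illustrative variable names (items, items_arrow) that are not part of the examples above:

```r
# Hypothetical helper, defined only for this illustration
count_items <- function(items) length(items)

# = inside a call names the argument; nothing is assigned in the caller
count_items(items = c(1, 2, 3))
named_created <- exists("items", inherits = FALSE)

# <- inside a call assigns in the calling environment, then passes the value
count_items(items_arrow <- c(1, 2, 3))
arrow_created <- exists("items_arrow", inherits = FALSE)

print(named_created)  # FALSE, provided no variable called items existed before
print(arrow_created)  # TRUE
```

Because the side effect is easy to miss when reading a call, many style guides prefer assigning on a separate line and then passing the variable.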

The <<- operator performs superassignment: it searches the enclosing environments for an existing variable to modify and, if none is found, creates the variable in the global environment:

# Demonstrating <<- behavior
counter <- 0

increment <- function() {
  counter <<- counter + 1
  return(counter)
}

increment()  # 1
increment()  # 2
print(counter)  # 2 (modified in global scope)

# Contrast with standard <-
reset <- function() {
  counter <- 0  # Creates local variable, doesn't affect global
  return(counter)
}

reset()  # 0
print(counter)  # Still 2
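A common place where <<- is arguably appropriate is a closure that keeps private state, so that nothing in the global environment is touched. This is a sketch under that assumption; make_counter() and the variable names are hypothetical:

```r
# Hypothetical factory: each call creates a separate enclosing environment
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # modifies count in the enclosing function, not globally
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()

first <- counter_a()   # 1
second <- counter_a()  # 2
other <- counter_b()   # 1 (counter_b has its own count)

print(c(first, second, other))  # [1] 1 2 1
```

Since <<- finds count in the enclosing function first, the two counters stay independent and no global variable named count is created.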

Variable Naming Conventions and Rules

R variable names must follow specific rules while offering flexibility for different naming styles:

# Valid variable names
user_count <- 100           # Snake case (recommended)
userCount <- 100            # Camel case
user.count <- 100           # Dot notation (traditional R style)
UserCount <- 100            # Pascal case
.hidden_var <- 100          # Leading dot (hidden from ls())
variable123 <- 100          # Alphanumeric

# Invalid variable names (will cause errors)
# 2users <- 100             # Cannot start with number
# user-count <- 100         # Hyphens not allowed
# _private <- 100           # Cannot start with underscore
# .2fast <- 100             # Dot followed by number not allowed

# Reserved words cannot be used
# if <- 10                  # Error: reserved word
# function <- 10            # Error: reserved word
# TRUE <- 10                # Error: reserved word
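These rules can be checked programmatically. The sketch below assumes a hypothetical helper is_syntactic(), which relies on base R's make.names() leaving an already valid name unchanged; it also shows that backticks permit a non-syntactic name when one is unavoidable:

```r
# Hypothetical helper: a name is syntactic if make.names() leaves it unchanged
is_syntactic <- function(name) make.names(name) == name

candidates <- c("user_count", ".hidden_var", "2users", "_private", ".2fast", "if")
validity <- is_syntactic(candidates)
print(setNames(validity, candidates))

# Backticks allow a non-syntactic name, at the cost of quoting it everywhere
`2users` <- 100
print(`2users` + 1)  # [1] 101
```

Backtick-quoted names mostly appear when importing data with awkward column headers; for variables you name yourself, a syntactic name is usually easier to work with.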

Case sensitivity matters significantly:

data <- c(1, 2, 3)
Data <- c(4, 5, 6)
DATA <- c(7, 8, 9)

print(data)  # [1] 1 2 3
print(Data)  # [1] 4 5 6
print(DATA)  # [1] 7 8 9

Dynamic Typing and Type Coercion

R uses dynamic typing, allowing variables to change types without explicit declaration:

# Variable changes type freely
value <- 42                    # Numeric
print(class(value))            # "numeric"

value <- "forty-two"           # Character
print(class(value))            # "character"

value <- TRUE                  # Logical
print(class(value))            # "logical"

value <- list(a = 1, b = 2)   # List
print(class(value))            # "list"

Type coercion happens automatically in many operations. Values are promoted to the most general type present, in the order logical < integer < double < complex < character:

# Automatic coercion examples
mixed <- c(TRUE, 1, 2.5, "text")
print(mixed)          # All converted to character
print(class(mixed))   # "character"

numeric_logical <- c(TRUE, FALSE, 1, 2, 3)
print(numeric_logical)  # [1] 1 0 1 2 3 (logical to numeric)
print(class(numeric_logical))  # "numeric"

# Mathematical operations coerce logicals
sum(c(TRUE, TRUE, FALSE, TRUE))  # 3
mean(c(TRUE, FALSE, TRUE, TRUE)) # 0.75
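Note that class() reports "numeric" for both integers and doubles in mixed vectors, so typeof() is often the clearer tool for seeing which storage type coercion produced. A small sketch, with illustrative variable names:

```r
whole <- 42   # a plain number literal is stored as a double
count <- 42L  # the L suffix requests an integer

print(typeof(whole))  # "double"
print(typeof(count))  # "integer"
print(class(whole))   # "numeric"

# Combining types promotes to the most general one present
print(typeof(c(TRUE, 2L)))     # "integer"
print(typeof(c(2L, 2.5)))      # "double"
print(typeof(c(2.5, "text")))  # "character"
```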

Explicit type checking and conversion prevent unexpected behavior:

# Type checking functions
value <- "123"

is.numeric(value)    # FALSE
is.character(value)  # TRUE
is.logical(value)    # FALSE

# Explicit conversion
num_value <- as.numeric(value)     # 123
int_value <- as.integer(value)     # 123L
char_value <- as.character(42)     # "42"

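# Safe conversion with error handling (works for vectors as well as single values)
safe_convert <- function(x) {
  result <- suppressWarnings(as.numeric(x))
  failed <- is.na(result) & !is.na(x)  # NA produced by coercion, not present in input
  if (any(failed)) {
    stop("Cannot convert to numeric: ", paste(x[failed], collapse = ", "), call. = FALSE)
  }
  return(result)
}

safe_convert("123")    # 123
safe_convert("abc")    # Error: Cannot convert to numeric: abc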
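When stopping is too strict, an alternative is to fall back to a default value. The sketch below assumes a hypothetical helper parse_or_default() and relies on the fact that as.numeric() signals a warning when coercion introduces NA:

```r
# Hypothetical helper: return a default instead of raising an error
parse_or_default <- function(text, default = NA_real_) {
  tryCatch(
    as.numeric(text),
    warning = function(w) default  # coercion warning means the text was not numeric
  )
}

parsed <- parse_or_default("12.5")
fallback <- parse_or_default("abc", default = 0)

print(parsed)    # [1] 12.5
print(fallback)  # [1] 0
```

Which behavior is preferable depends on the caller: an error surfaces bad data early, while a default keeps a pipeline running at the risk of hiding the problem.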

Multiple Assignment and Destructuring

R has no built-in destructuring assignment of the kind found in languages such as Python or JavaScript, but workarounds exist:

# Multiple assignment using list indexing
result <- list(mean = 5.5, sd = 2.1, n = 100)
mean_val <- result$mean
sd_val <- result$sd
n_val <- result$n

# Using with() for temporary scope
with(result, {
  print(paste("Mean:", mean))
  print(paste("SD:", sd))
  print(paste("N:", n))
})

# Parallel assignment using zeallot package (if available)
# library(zeallot)
# c(mean_val, sd_val, n_val) %<-% c(5.5, 2.1, 100)
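Base R can also unpack a named list in one step with list2env(), which creates one variable per list element. This is a sketch only; the element names center, spread, and size are hypothetical and were chosen so that the new variables do not share names with the base functions mean() and sd():

```r
summary_stats <- list(center = 5.5, spread = 2.1, size = 100)

# Create one variable per list element in the current environment
invisible(list2env(summary_stats, envir = environment()))

print(center)  # [1] 5.5
print(spread)  # [1] 2.1
print(size)    # [1] 100
```

This silently overwrites any existing variables with the same names, so it is probably best reserved for short scripts or for a dedicated environment.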

Vector assignment allows updating multiple elements:

# Vector element assignment
numbers <- c(10, 20, 30, 40, 50)
numbers[c(1, 3, 5)] <- c(100, 300, 500)
print(numbers)  # [1] 100  20 300  40 500

# Conditional assignment
numbers[numbers < 100] <- 0
print(numbers)  # [1] 100   0 300   0 500

# Named vector assignment
scores <- c(alice = 95, bob = 87, charlie = 92)
scores["bob"] <- 90
print(scores)  # alice 95, bob 90, charlie 92
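Two related behaviors are worth knowing: assigning past the end of a vector extends it and pads the gap with NA, and replace() returns a modified copy rather than changing the original. A short sketch with illustrative names:

```r
readings <- c(10, 20, 30)

# Assigning beyond the current length extends the vector with NA
readings[5] <- 50
print(readings)  # [1] 10 20 30 NA 50

# Fill the gap using a logical index
readings[is.na(readings)] <- 0
print(readings)  # [1] 10 20 30  0 50

# replace() returns a modified copy and leaves readings untouched
capped <- replace(readings, readings > 25, 25)
print(capped)    # [1] 10 20 25  0 25
```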

Variable Scope and Environments

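R resolves variable names lexically: a function looks first in its own environment, then in the environment where it was defined, and so on up to the global environment. Understanding this prevents subtle bugs: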

# Global scope
global_var <- "global"

test_scope <- function() {
  # Local scope
  local_var <- "local"
  
  # Accessing global variable
  print(global_var)  # "global"
  
  # Modifying the global from here needs <<- (plain <- would create a local)
  global_var <<- "modified"
  
  # Nested function scope
  inner_function <- function() {
    inner_var <- "inner"
    print(local_var)   # Accesses parent function scope
    print(global_var)  # Accesses global scope
  }
  
  inner_function()
}

test_scope()
print(global_var)  # "modified"
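One consequence of lexical scoping is that a function sees the variables from where it was defined, not from where it is called. A minimal sketch with hypothetical names:

```r
scale_factor <- 2

# Defined at the top level, so scale_factor is looked up in the global environment
scale_value <- function(v) v * scale_factor

run_with_local_factor <- function() {
  scale_factor <- 100  # local to this function; scale_value() does not see it
  scale_value(1)
}

result_lexical <- run_with_local_factor()
print(result_lexical)  # [1] 2
```

Passing scale_factor as an explicit argument would remove the dependence on outside state and make the function easier to test.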

Check variable existence before use:

# Safe variable access
if (exists("undefined_var")) {
  print(undefined_var)
} else {
  print("Variable does not exist")
}

# Get with default value
value <- get0("possibly_undefined", ifnotfound = 0)

# Remove variables
rm(value)
exists("value")  # FALSE

# Clear workspace
# rm(list = ls())  # Removes all variables
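By default exists() and get0() also search enclosing environments, so a name such as mean is always found because of the base function. The sketch below, which uses a hypothetical settings environment, shows how inherits = FALSE restricts the lookup to a single environment:

```r
settings <- new.env()
assign("timeout", 30, envir = settings)

has_timeout <- exists("timeout", envir = settings, inherits = FALSE)
has_mean_local <- exists("mean", envir = settings, inherits = FALSE)
has_mean_any <- exists("mean", envir = settings)  # finds base::mean via parents

# Default value when the name is missing from this environment
retries <- get0("retries", envir = settings, inherits = FALSE, ifnotfound = 3)

print(c(has_timeout, has_mean_local, has_mean_any))  # [1]  TRUE FALSE  TRUE
print(retries)                                       # [1] 3
```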

Practical Patterns

Implement constants using naming conventions:

# Constants (uppercase by convention, not enforced)
MAX_ITERATIONS <- 1000
PI_APPROX <- 3.14159
DEFAULT_TIMEOUT <- 30

# Validation function
validate_input <- function(x, max_val = MAX_ITERATIONS) {
  if (!is.numeric(x)) {
    stop("Input must be numeric")
  }
  if (x > max_val) {
    warning(paste("Value exceeds maximum:", max_val))
    return(max_val)
  }
  return(x)
}

iterations <- validate_input(1500)  # Warning, returns 1000
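If a naming convention is not enough, base R's lockBinding() can make reassignment fail. This is a sketch, not a recommendation for every project; it locks the binding in the current environment and uses tryCatch() only to show the outcome:

```r
MAX_ITERATIONS <- 1000
lockBinding("MAX_ITERATIONS", environment())

# Any attempt to reassign a locked binding raises an error
outcome <- tryCatch({
  MAX_ITERATIONS <- 5000
  "reassigned"
}, error = function(e) "blocked")

print(outcome)         # "blocked"
print(MAX_ITERATIONS)  # 1000
```

The lock can be lifted again with unlockBinding(), so this guards against accidents rather than deliberate changes.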

This foundation in R’s assignment operators and variable behavior enables writing robust, maintainable code that leverages R’s dynamic nature while avoiding common pitfalls.
