R - Write CSV File (write.csv / readr::write_csv)

Key Insights

  • Base R’s write.csv() provides simple CSV export with automatic row names, while readr::write_csv() offers faster performance and cleaner defaults without row names
  • Control critical parameters like delimiters, quoting behavior, NA representation, and encoding to ensure data integrity across different systems and applications
  • For production workflows, implement proper error handling, validate output files, and consider append operations for incremental data exports

Base R write.csv() Fundamentals

The `write.csv()` function is R's built-in solution for exporting data frames to CSV format. It's a wrapper around `write.table()` with sensible defaults for comma-separated values.

# Basic CSV export
data <- data.frame(
  id = 1:5,
  name = c("Alice", "Bob", "Charlie", "Diana", "Eve"),
  score = c(95.5, 87.3, 92.1, 88.9, 94.2),
  passed = c(TRUE, TRUE, TRUE, TRUE, TRUE)
)

write.csv(data, "output.csv")

By default, write.csv() includes row names as the first column. Disable this behavior with row.names = FALSE:

write.csv(data, "output_no_rownames.csv", row.names = FALSE)

Handling Special Characters and Quoting

CSV files require proper quoting when fields contain delimiters, quotes, or newlines. Control this with the quote parameter:

# Data with special characters
messy_data <- data.frame(
  description = c("Product, Type A", "Item with \"quotes\"", "Normal text"),
  value = c(100, 200, 300)
)

# Default quoting: all character and factor fields are quoted
write.csv(messy_data, "quoted.csv", row.names = FALSE)

# Disable quoting entirely. Unsafe here: the embedded commas and quotes
# will corrupt the field boundaries
write.csv(messy_data, "no_quotes.csv", row.names = FALSE, quote = FALSE)

# Explicitly quote specific columns
write.csv(messy_data, "selective_quotes.csv", 
          row.names = FALSE, 
          quote = c(1))  # Quote only first column
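
To see exactly what the default mode writes, it helps to inspect the raw file with `readLines()`. A quick sketch; note that `write.csv()` doubles embedded quotes by default (`qmethod = "double"`):

```r
# Inspect the raw lines produced by the default quoting mode
messy_data <- data.frame(
  description = c("Product, Type A", "Item with \"quotes\""),
  value = c(100, 200)
)

write.csv(messy_data, "quoted_check.csv", row.names = FALSE)
cat(readLines("quoted_check.csv"), sep = "\n")
# "description","value"
# "Product, Type A",100
# "Item with ""quotes""",200
```

Because the comma in "Product, Type A" sits inside quotes, any compliant CSV parser will keep it as a single field.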

Managing NA Values and Missing Data

Different systems expect different representations for missing values. Control this with the na parameter:

data_with_na <- data.frame(
  id = 1:4,
  value1 = c(10, NA, 30, 40),
  value2 = c(NA, 20, 30, NA)
)

# Default: writes "NA"
write.csv(data_with_na, "na_default.csv", row.names = FALSE)

# Empty string for NA
write.csv(data_with_na, "na_empty.csv", row.names = FALSE, na = "")

# Custom NA representation
write.csv(data_with_na, "na_custom.csv", row.names = FALSE, na = "NULL")
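
Reading the raw file back makes the difference visible. A small sketch using `readLines()` on the `na = ""` variant:

```r
# How na = "" appears in the raw output: the missing value becomes
# an empty field after the trailing comma
data_with_na <- data.frame(id = 1:2, value = c(10, NA))

write.csv(data_with_na, "na_check.csv", row.names = FALSE, na = "")
cat(readLines("na_check.csv"), sep = "\n")
# "id","value"
# 1,10
# 2,
```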

Using readr::write_csv() for Performance

The readr package provides write_csv() with better defaults and significantly faster performance for large datasets:

library(readr)

# Basic usage - no row names by default
write_csv(data, "readr_output.csv")

# Explicitly handle NA values
write_csv(data_with_na, "readr_na.csv", na = "")

# Performance comparison
large_data <- data.frame(
  x = rnorm(1000000),
  y = sample(letters, 1000000, replace = TRUE),
  z = runif(1000000)
)

# Timings below are illustrative; actual numbers vary by machine
system.time(write.csv(large_data, "base_large.csv", row.names = FALSE))
#   user  system elapsed 
#  12.34    0.45   12.81

system.time(write_csv(large_data, "readr_large.csv"))
#   user  system elapsed 
#   2.15    0.31    2.47

Appending Data to Existing Files

For incremental exports or logging scenarios, append data without overwriting:

# Initial write
initial_data <- data.frame(timestamp = Sys.time(), value = 100)
write.csv(initial_data, "log.csv", row.names = FALSE)

# Append new data (base R)
new_data <- data.frame(timestamp = Sys.time(), value = 200)
write.table(new_data, "log.csv", 
            sep = ",", 
            append = TRUE, 
            col.names = FALSE, 
            row.names = FALSE)

# Append with readr
write_csv(new_data, "log.csv", append = TRUE)
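
For repeated logging calls, a small wrapper (the helper name `log_row` is hypothetical) can pick header-vs-append automatically by checking whether the file already exists:

```r
library(readr)

# Write the header on the first call, append without a header afterwards
log_row <- function(row, path) {
  write_csv(row, path, append = file.exists(path))
}

log_row(data.frame(timestamp = Sys.time(), value = 100), "run_log.csv")
log_row(data.frame(timestamp = Sys.time(), value = 200), "run_log.csv")
```

Because `write_csv()` defaults to `col_names = !append`, the header is written exactly once.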

Controlling Delimiters and Formats

While CSV implies comma separation, you can customize delimiters for different requirements:

# Semicolon delimiter and decimal comma (common in European locales)
write.csv2(data, "semicolon.csv", row.names = FALSE)

# Custom delimiter using write.table
write.table(data, "pipe_delimited.txt", 
            sep = "|", 
            row.names = FALSE, 
            quote = FALSE)

# Tab-separated values
write.table(data, "tab_delimited.tsv", 
            sep = "\t", 
            row.names = FALSE, 
            quote = FALSE)
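
A round-trip check with base `read.table()` confirms the custom delimiter survives. A quick sketch (file name is illustrative):

```r
# Write with a pipe delimiter, then read back with the same sep
df <- data.frame(a = 1:2, b = c("x", "y"))
write.table(df, "pipe_check.txt", sep = "|", row.names = FALSE, quote = FALSE)
read.table("pipe_check.txt", sep = "|", header = TRUE)
#   a b
# 1 1 x
# 2 2 y
```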

Handling Encoding Issues

Character encoding matters when sharing files across systems or dealing with international characters:

# Data with international characters
intl_data <- data.frame(
  name = c("José", "François", "Müller", "Søren"),
  city = c("São Paulo", "Montréal", "München", "København")
)

# UTF-8 encoding (recommended)
write.csv(intl_data, "utf8.csv", row.names = FALSE, fileEncoding = "UTF-8")

# Latin1 encoding
write.csv(intl_data, "latin1.csv", row.names = FALSE, fileEncoding = "latin1")

# readr defaults to UTF-8
write_csv(intl_data, "readr_utf8.csv")
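
When a downstream consumer expects a specific encoding, verify the round trip by reading back with a matching locale. A sketch using `readr::locale()`:

```r
library(readr)

intl <- data.frame(name = c("José", "Müller"))
write.csv(intl, "latin1_check.csv", row.names = FALSE, fileEncoding = "latin1")

# read_csv() assumes UTF-8 unless told otherwise; declare the encoding
back <- read_csv("latin1_check.csv",
                 locale = locale(encoding = "latin1"),
                 show_col_types = FALSE)
```

Skipping the `locale` argument here would silently mangle the accented characters.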

Production-Ready Export Function

Implement robust CSV export with validation and error handling:

export_csv_safe <- function(data, filepath, validate = TRUE) {
  # Input validation
  if (!is.data.frame(data)) {
    stop("Input must be a data frame")
  }
  
  if (nrow(data) == 0) {
    warning("Exporting empty data frame")
  }
  
  # Create directory if needed
  dir_path <- dirname(filepath)
  if (!dir.exists(dir_path)) {
    dir.create(dir_path, recursive = TRUE)
  }
  
  # Export with error handling (requires the readr package)
  tryCatch({
    readr::write_csv(data, filepath, na = "")
    
    # Validate output if requested
    if (validate) {
      verify <- readr::read_csv(filepath, show_col_types = FALSE)
      if (nrow(verify) != nrow(data)) {
        stop("Row count mismatch in exported file")
      }
    }
    
    message(sprintf("Successfully exported %d rows to %s", 
                    nrow(data), filepath))
    return(invisible(TRUE))
    
  }, error = function(e) {
    stop(sprintf("Export failed: %s", e$message))
  })
}

# Usage
export_csv_safe(data, "output/validated_export.csv")

Exporting Large Datasets Efficiently

For very large exports, chunked writing provides progress feedback and generalizes to data produced incrementally, such as batches fetched from a database:

export_large_csv <- function(data, filepath, chunk_size = 100000) {
  n_rows <- nrow(data)
  n_chunks <- ceiling(n_rows / chunk_size)
  
  for (i in seq_len(n_chunks)) {
    start_row <- (i - 1) * chunk_size + 1
    end_row <- min(i * chunk_size, n_rows)
    chunk <- data[start_row:end_row, , drop = FALSE]  # drop = FALSE keeps a data frame
    
    if (i == 1) {
      write_csv(chunk, filepath)
    } else {
      write_csv(chunk, filepath, append = TRUE)
    }
    
    message(sprintf("Wrote chunk %d/%d", i, n_chunks))
  }
}

# Generate large dataset
very_large_data <- data.frame(
  id = 1:500000,
  value = rnorm(500000)
)

export_large_csv(very_large_data, "large_output.csv")

Comparing write.csv() vs write_csv()

Choose the right function based on your requirements:

Use write.csv() when:

  • Working in base R environments without additional dependencies
  • Need row names preserved in output
  • Compatibility with legacy code is essential

Use readr::write_csv() when:

  • Performance matters for large datasets
  • Want cleaner defaults (no row names)
  • Need consistent UTF-8 encoding
  • Working within tidyverse workflows
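
The difference in defaults is easy to see in the first line each function writes (a quick sketch; file names are illustrative):

```r
library(readr)

df <- data.frame(x = 1:2)

write.csv(df, "base_default.csv")   # row names become an unnamed first column
write_csv(df, "readr_default.csv")  # no row names, minimal quoting

readLines("base_default.csv")[1]    # header is: "","x"
readLines("readr_default.csv")[1]   # header is: x
```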

Both functions handle the core task effectively, but readr::write_csv() provides better performance and more sensible defaults for modern data workflows. For production systems processing large volumes of data, the performance gains from readr justify the additional dependency.
