R - Create Custom Package
Key Insights
- Custom R packages provide reusable, documented code with built-in testing frameworks, making them superior to scattered script files for any serious development work
- The usethis and devtools packages automate package scaffolding, documentation generation, and testing setup, reducing package creation to a few function calls
- Proper package structure with NAMESPACE management and roxygen2 documentation ensures your code integrates seamlessly with R’s ecosystem and remains maintainable
Why Build a Custom Package
R packages aren’t just for CRAN distribution. Any collection of functions you use repeatedly across projects benefits from package structure. You get automatic dependency management, integrated help documentation, unit testing infrastructure, and explicit versioning, all standard features that would otherwise require manual setup.
A package transforms loose R scripts into a professional codebase. Instead of source("utils.R") scattered across projects, you write library(yourpackage) once. Functions become discoverable through autocomplete and ?function_name help pages.
Setting Up the Development Environment
Install the essential package development tools:
install.packages(c("devtools", "usethis", "roxygen2", "testthat"))
These packages handle the heavy lifting. devtools provides the build and check functions, usethis automates file creation, roxygen2 generates documentation from code comments, and testthat structures your unit tests.
Create your package skeleton:
library(usethis)
create_package("~/projects/datautils")
This generates the minimal package structure:
datautils/
├── DESCRIPTION
├── NAMESPACE
├── R/
└── datautils.Rproj
The DESCRIPTION file contains package metadata. The R/ directory holds your function definitions. The NAMESPACE file controls which functions are exported; let roxygen2 manage it rather than editing it by hand.
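To make the metadata format concrete, here is a minimal sketch in base R that writes and reads a DESCRIPTION-style file. The field values are illustrative assumptions, not what create_package() writes for you; write.dcf() and read.dcf() are used because DESCRIPTION follows the Debian control file layout they handle.

```r
# Hypothetical metadata for the datautils package; every value is illustrative.
desc_fields <- data.frame(
  Package = "datautils",
  Title = "Utility Functions for Data Work",
  Version = "0.0.0.9000",
  Description = "A small collection of helper functions.",
  License = "MIT + file LICENSE",
  stringsAsFactors = FALSE
)

# Write the fields in "Field: value" form to a temporary DESCRIPTION file.
desc_path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc_fields, file = desc_path)

# Read it back: a character matrix with one column per field.
desc <- read.dcf(desc_path)
print(desc[1, "Package"])
print(desc[1, "Version"])
```

In a real package you would edit DESCRIPTION directly or through usethis helpers; the round trip here only shows how the fields are structured.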
Writing Package Functions
Create a new R file in the R/ directory for your functions. Use roxygen2 comments (starting with #') above each function to generate documentation:
#' Calculate Moving Average
#'
#' Computes a simple moving average over a specified window.
#'
#' @param x Numeric vector of values
#' @param window Integer window size for the moving average
#' @param na.rm Logical indicating whether to remove NA values
#'
#' @return Numeric vector of moving averages with the same length as the input.
#'   The first `window - 1` elements average over the values available so far.
#'
#' @examples
#' moving_avg(c(1, 2, 3, 4, 5), window = 3)
#' moving_avg(c(1, NA, 3, 4, 5), window = 2, na.rm = TRUE)
#'
#' @export
moving_avg <- function(x, window = 3, na.rm = FALSE) {
  if (!is.numeric(x)) {
    stop("Input must be numeric")
  }
  if (window < 1 || window > length(x)) {
    stop("Window must be between 1 and length of input")
  }
  result <- numeric(length(x))
  for (i in seq_along(x)) {
    start_idx <- max(1, i - window + 1)
    end_idx <- i
    window_values <- x[start_idx:end_idx]
    if (na.rm) {
      window_values <- window_values[!is.na(window_values)]
    }
    result[i] <- mean(window_values, na.rm = na.rm)
  }
  result
}
The @export tag tells roxygen2 to add this function to the NAMESPACE, making it available when users load your package. Functions without @export remain internal.
Save this file as R/moving_avg.R. Use usethis::use_r("moving_avg") to create files with proper naming conventions automatically.
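The exported/internal split is easy to see in any installed package. The sketch below inspects stats, chosen only because it ships with every R installation; the same calls should work on datautils once it is installed.

```r
# Every object defined inside the stats namespace, exported or not.
ns_objects <- ls(asNamespace("stats"), all.names = TRUE)

# Only the names the package makes available to users.
exported <- getNamespaceExports("stats")

# Internal objects: defined in the package but not exported.
internal <- setdiff(ns_objects, exported)

print("median" %in% exported)  # median() is an exported function
print(length(internal) > 0)    # internal objects exist as well
```

Internal objects remain reachable with the ::: operator, which is handy for debugging but should not be relied on from other packages.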
Managing Dependencies
If your functions use other packages, declare them properly. For packages required at runtime:
usethis::use_package("dplyr")
This adds dplyr to the Imports field in DESCRIPTION. Reference functions from imported packages using the :: operator:
#' Summarize Data by Group
#'
#' @param data A data frame
#' @param group_var Column name to group by (unquoted)
#' @param value_var Column name to summarize (unquoted)
#'
#' @return A data frame with grouped summaries
#' @export
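summarize_groups <- function(data, group_var, value_var) {
  # median() lives in stats; qualify it and declare the dependency
  # with usethis::use_package("stats") to keep R CMD check quiet.
  data |>
    dplyr::group_by({{ group_var }}) |>
    dplyr::summarize(
      mean = mean({{ value_var }}, na.rm = TRUE),
      median = stats::median({{ value_var }}, na.rm = TRUE),
      n = dplyr::n()
    )
}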
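The :: operator used above works for any installed package without attaching it. A minimal base R sketch, using tools (bundled with R) purely as a stand-in for an imported dependency:

```r
# Call a function by its fully qualified name; no library(tools) needed.
ext <- tools::file_ext("R/moving_avg.R")
print(ext)

# The namespace is now loaded, which is all that :: requires.
print(isNamespaceLoaded("tools"))
```

Qualified calls keep package code from depending on whatever the user happens to have attached in their session.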
For packages only needed during development or testing:
usethis::use_package("ggplot2", type = "Suggests")
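Because packages listed in Suggests may not be installed on a user's machine, code that uses them usually checks first. A sketch of that guard follows; check_suggested() and plot_series() are hypothetical names, not part of usethis or of the package built above.

```r
# Hypothetical helper: fail with a clear message when an optional package is missing.
check_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this function. Please install it.",
         call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical plotting function that relies on ggplot2 from Suggests.
plot_series <- function(values) {
  check_suggested("ggplot2")
  df <- data.frame(index = seq_along(values), value = values)
  ggplot2::ggplot(df, ggplot2::aes(x = index, y = value)) +
    ggplot2::geom_line()
}
```

The check runs at call time, so the rest of the package keeps working for users who never touch the plotting function.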
Generating Documentation
Convert roxygen2 comments into proper R documentation:
devtools::document()
This creates .Rd files in the man/ directory and updates NAMESPACE. Run this command every time you modify roxygen2 comments. The generated help files follow R’s standard format and appear when users type ?moving_avg.
Create a package-level documentation file:
usethis::use_package_doc()
This generates R/datautils-package.R with a roxygen2 skeleton for overall package documentation.
Adding Unit Tests
Set up the testing infrastructure:
usethis::use_testthat()
This creates a tests/ directory structure. Create test files for each function:
usethis::use_test("moving_avg")
Write tests in tests/testthat/test-moving_avg.R:
test_that("moving_avg calculates correctly", {
  x <- c(1, 2, 3, 4, 5)
  result <- moving_avg(x, window = 3)
  expect_equal(result[1], 1)
  expect_equal(result[2], 1.5)
  expect_equal(result[3], 2)
  expect_equal(result[4], 3)
  expect_equal(result[5], 4)
})
test_that("moving_avg handles NA values", {
  x <- c(1, NA, 3, 4, 5)
  result_keep <- moving_avg(x, window = 2, na.rm = FALSE)
  expect_true(is.na(result_keep[2]))
  result_remove <- moving_avg(x, window = 2, na.rm = TRUE)
  expect_equal(result_remove[2], 1)
})
test_that("moving_avg validates inputs", {
  expect_error(moving_avg("text", window = 2))
  expect_error(moving_avg(1:5, window = 0))
  expect_error(moving_avg(1:5, window = 10))
})
Run tests with:
devtools::test()
Tests also run automatically as part of devtools::check(), so failures surface alongside documentation and packaging problems, whether or not you ever submit to CRAN.
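One reason the assertions above use expect_equal() rather than an exact comparison is that it allows a small numerical tolerance, which matters for floating-point results such as averages. The base R sketch below illustrates the same distinction with all.equal() and identical(); it does not call testthat itself.

```r
# Two values that are mathematically equal but differ in floating point.
computed <- 0.1 + 0.2
expected <- 0.3

# Exact comparison is sensitive to representation error.
exact_match <- identical(computed, expected)

# Tolerance-based comparison treats tiny differences as equal.
close_match <- isTRUE(all.equal(computed, expected))

print(exact_match)  # FALSE
print(close_match)  # TRUE
```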
Building and Installing
Load your package for interactive development:
devtools::load_all()
This simulates installing and loading the package without actually installing it. Changes to code are immediately available after running load_all() again.
Check your package for common issues:
devtools::check()
This runs R CMD check, which validates documentation, runs your tests and examples, and flags common packaging errors. Address all errors and warnings, and ideally notes as well, before sharing your package.
Install the package locally:
devtools::install()
Now you can use library(datautils) in any R session like any other package.
Version Control and README
Initialize git tracking:
usethis::use_git()
Create a README with package overview:
usethis::use_readme_md()
Edit README.md to include installation instructions and usage examples:
## Installation
```r
# Install from GitHub
devtools::install_github("yourusername/datautils")
```
## Usage
```r
library(datautils)
data <- c(10, 15, 13, 18, 20, 22, 19)
smoothed <- moving_avg(data, window = 3)
```
For GitHub hosting:
usethis::use_github()
This creates a repository and pushes your package code automatically if you have GitHub credentials configured.
Data and Vignettes
Include example datasets:
my_data <- data.frame(x = 1:10, y = rnorm(10))
usethis::use_data(my_data)
This saves my_data to data/ as an .rda file, making it available when users load your package.
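An .rda file is an ordinary save() archive. The sketch below reproduces the round trip in base R with a temporary file; it illustrates the file format and is not the exact call usethis makes.

```r
set.seed(42)  # make the random column reproducible
my_data <- data.frame(x = 1:10, y = rnorm(10))

# Save the object by name into an .rda file, as a package's data/ directory would hold.
rda_path <- file.path(tempdir(), "my_data.rda")
save(my_data, file = rda_path)

# Load it into a fresh environment to confirm the object comes back intact.
restored <- new.env()
loaded_names <- load(rda_path, envir = restored)
print(loaded_names)
print(isTRUE(all.equal(restored$my_data, my_data)))
```

Setting a seed before generating example data keeps the shipped dataset reproducible if you ever need to rebuild it.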
Create long-form documentation with vignettes:
usethis::use_vignette("introduction")
Edit the generated R Markdown file in vignettes/ to provide comprehensive usage examples and tutorials.
Custom packages transform R development from script chaos into organized, testable, documented systems. The initial setup investment pays dividends through reusability, maintainability, and professional code structure. Start packaging any function collection you use across multiple projects.