R - difftime() - Difference Between Dates


Key Insights

  • The difftime() function calculates time intervals between two date or datetime objects, returning a “difftime” object with explicit units that you control via the units parameter.
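  • Always ensure your date objects are properly typed using as.Date() for dates or as.POSIXct() for datetimes before passing them to difftime(). Character strings are coerced with as.POSIXct(), which can reject or misparse non-ISO formats and interprets the rest in the session time zone, so results may not be what you expect.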
  • For complex date arithmetic involving months or years, consider the lubridate package instead, as difftime() only supports units up to weeks and doesn’t account for variable month lengths.

Introduction to difftime()

Calculating the difference between dates is one of the most common operations in data analysis. Whether you’re measuring customer lifetime, calculating project durations, or analyzing time-to-event data, you need reliable date arithmetic. R’s built-in difftime() function handles this cleanly without external dependencies.

The function takes two date or datetime objects and returns the interval between them in your specified units. It’s straightforward, predictable, and works well for most use cases involving days, hours, minutes, or seconds. Understanding difftime() is essential before reaching for heavier packages—often it’s all you need.

Function Syntax and Parameters

The difftime() function signature is:

difftime(time1, time2, tz, units = c("auto", "secs", "mins", "hours", "days", "weeks"))

Here’s what each parameter does:

  • time1: The later date/time object (the “end” date)
  • time2: The earlier date/time object (the “start” date)
  • tz: Optional timezone specification for the calculation
  • units: The unit of measurement for the result

The calculation is time1 - time2, so if time1 is later than time2, you get a positive result; swap the arguments and the sign flips. The units parameter defaults to “auto”, which picks seconds, minutes, hours, or days according to the magnitude of the smallest non-missing difference (it never picks weeks). I recommend always specifying units explicitly, because “auto” can return different units for different inputs.

# Basic syntax demonstration
start_date <- as.Date("2024-01-15")
end_date <- as.Date("2024-03-20")

# Calculate difference in days
diff_days <- difftime(end_date, start_date, units = "days")
print(diff_days)
# Time difference of 65 days

# The result is a difftime object
class(diff_days)
# [1] "difftime"

# Extract numeric value when needed
as.numeric(diff_days)
# [1] 65
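
To make the “auto” behaviour concrete, here is a small sketch showing how the same bare number can mean seconds, minutes, or days depending on the size of the gap. The variable names are illustrative only, and the reference time is pinned to UTC so the result does not depend on your session time zone.

```r
# Fixed reference time in UTC (an arbitrary choice for this illustration)
t0 <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")

# Adding a number to a POSIXct adds that many seconds
gap_short  <- difftime(t0 + 45, t0)          # 45 seconds later
gap_medium <- difftime(t0 + 45 * 60, t0)     # 45 minutes later
gap_long   <- difftime(t0 + 45 * 86400, t0)  # 45 days later

units(gap_short)   # "secs"
units(gap_medium)  # "mins"
units(gap_long)    # "days"

# Once the unit is dropped, all three look identical
c(as.numeric(gap_short), as.numeric(gap_medium), as.numeric(gap_long))
# [1] 45 45 45
```

Passing units explicitly removes this ambiguity, which is why the rest of the examples do so.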

Creating Date/Time Objects for difftime()

Before using difftime(), you need properly typed date objects. R provides three main functions for this:

as.Date() creates date objects without time components. Use this when you only care about calendar dates.

as.POSIXct() creates datetime objects stored as seconds since Unix epoch. This is the most common choice for timestamps.

as.POSIXlt() creates datetime objects stored as a list of components (year, month, day, hour, etc.). Useful when you need to extract individual components, but less efficient for large datasets.

# Converting strings to date objects

# Simple date format (default is YYYY-MM-DD)
date1 <- as.Date("2024-06-15")

# Custom date format
date2 <- as.Date("15/06/2024", format = "%d/%m/%Y")

# US format
date3 <- as.Date("06-15-2024", format = "%m-%d-%Y")

# Creating datetime objects with POSIXct
datetime1 <- as.POSIXct("2024-06-15 14:30:00")
datetime2 <- as.POSIXct("2024-06-15 09:15:00")

# With explicit timezone
datetime3 <- as.POSIXct("2024-06-15 14:30:00", tz = "America/New_York")

# Verify the classes
class(date1)      # [1] "Date"
class(datetime1)  # [1] "POSIXct" "POSIXt"
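
As a brief illustration of the component access that as.POSIXlt() offers, the sketch below pulls individual fields out of a datetime and then passes POSIXlt objects straight to difftime(). The timestamps are made up for the example, and UTC is assumed so the output is the same on any machine.

```r
# POSIXlt stores the datetime as named components
lt <- as.POSIXlt("2024-06-15 14:30:00", tz = "UTC")

lt$hour         # 14
lt$min          # 30
lt$mday         # 15 (day of the month)
lt$mon + 1      # 6  (mon is zero-based: January is 0)
lt$year + 1900  # 2024 (year is stored as years since 1900)

# difftime() accepts POSIXlt objects as well
earlier <- as.POSIXlt("2024-06-15 09:15:00", tz = "UTC")
hours_apart <- difftime(lt, earlier, units = "hours")
print(hours_apart)
# Time difference of 5.25 hours
```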

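A common mistake is passing character strings directly to difftime(). R coerces them with as.POSIXct(), which recognizes only a few ISO-like formats and interprets them in the session time zone (or the tz argument), so the result can shift with daylight saving time or with the machine the code runs on. Always convert explicitly.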
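
Here is a sketch of how string inputs can quietly change a result. The tz argument is set to "America/New_York" purely to make the example reproducible; without it, the strings would be parsed in whatever time zone your session uses.

```r
# Strings are parsed as local midnights in the given time zone.
# The US switched to daylight saving time on 2024-03-10, so one hour is "lost".
from_strings <- difftime("2024-03-20", "2024-01-15",
                         tz = "America/New_York", units = "days")
as.numeric(from_strings)
# [1] 64.95833

# Date objects carry no time of day, so the answer is a whole number of days
from_dates <- difftime(as.Date("2024-03-20"), as.Date("2024-01-15"),
                       units = "days")
as.numeric(from_dates)
# [1] 65
```

The fractional result is not a bug in difftime(); it is the true elapsed time between two local midnights. It is rarely what you want when counting calendar days, though.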

Basic Usage and Unit Conversions

The power of difftime() lies in its flexible unit specification. The same date pair produces different numeric results depending on your chosen unit:

# Define two datetime objects
start <- as.POSIXct("2024-01-01 00:00:00")
end <- as.POSIXct("2024-01-08 12:00:00")

# Calculate in different units
diff_weeks <- difftime(end, start, units = "weeks")
diff_days <- difftime(end, start, units = "days")
diff_hours <- difftime(end, start, units = "hours")
diff_mins <- difftime(end, start, units = "mins")
diff_secs <- difftime(end, start, units = "secs")

# Print results
cat("Weeks:", as.numeric(diff_weeks), "\n")   # 1.071429
cat("Days:", as.numeric(diff_days), "\n")     # 7.5
cat("Hours:", as.numeric(diff_hours), "\n")   # 180
cat("Minutes:", as.numeric(diff_mins), "\n")  # 10800
cat("Seconds:", as.numeric(diff_secs), "\n")  # 648000

# You can also use the subtraction operator directly
# This returns a difftime object with auto units
simple_diff <- end - start
print(simple_diff)
# Time difference of 7.5 days

# Change units on an existing difftime object
units(simple_diff) <- "hours"
print(simple_diff)
# Time difference of 180 hours

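Notice that difftime objects retain their unit information. This is helpful for display, but as.numeric() returns the value in whatever unit the object currently carries, so the same interval can come back as 7.5 or as 180. When doing arithmetic, convert to numeric first and name the unit, for example as.numeric(simple_diff, units = "hours").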
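
The sketch below shows one way to keep units unambiguous when extracting numbers: the difftime method for as.numeric() accepts a units argument. The 160-hour threshold is a made-up figure used only for illustration.

```r
elapsed <- difftime(as.POSIXct("2024-01-08 12:00:00", tz = "UTC"),
                    as.POSIXct("2024-01-01 00:00:00", tz = "UTC"))

units(elapsed)                         # "days" (chosen by "auto")
as.numeric(elapsed)                    # 7.5, in the object's current unit
as.numeric(elapsed, units = "hours")   # 180
as.numeric(elapsed, units = "weeks")   # 1.071429

# Comparing against a threshold is safe once the unit is stated
hours_elapsed <- as.numeric(elapsed, units = "hours")
hours_elapsed > 160
# [1] TRUE
```

Requesting the unit at conversion time leaves the original object untouched, so later code that prints it still sees days.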

Working with difftime in Data Frames

Real-world analysis rarely involves single date pairs. You’ll typically calculate differences across entire columns. difftime() is vectorized, making this straightforward:

# Create sample customer data
customers <- data.frame(
  customer_id = 1:5,
  signup_date = as.Date(c("2023-01-15", "2023-03-22", "2023-06-01", 
                          "2023-09-10", "2024-01-05")),
  last_purchase = as.Date(c("2024-06-01", "2024-05-15", "2024-06-10",
                            "2024-04-20", "2024-06-15"))
)

# Calculate customer tenure (days since signup)
today <- as.Date("2024-06-20")
customers$tenure_days <- as.numeric(difftime(today, customers$signup_date, 
                                              units = "days"))

# Calculate days since last purchase
customers$days_since_purchase <- as.numeric(difftime(today, customers$last_purchase,
                                                      units = "days"))

# Calculate active period (signup to last purchase)
customers$active_period <- as.numeric(difftime(customers$last_purchase, 
                                                customers$signup_date,
                                                units = "days"))

print(customers)
#   customer_id signup_date last_purchase tenure_days days_since_purchase active_period
# 1           1  2023-01-15    2024-06-01         522                  19           503
# 2           2  2023-03-22    2024-05-15         456                  36           420
# 3           3  2023-06-01    2024-06-10         385                  10           375
# 4           4  2023-09-10    2024-04-20         284                  61           223
# 5           5  2024-01-05    2024-06-15         167                   5           162

# Calculate average tenure
mean(customers$tenure_days)
# [1] 362.8

This pattern—converting the difftime result to numeric immediately—is the cleanest approach for dataframe operations. It avoids unit confusion and produces standard numeric columns that work with all R functions.

Handling Edge Cases and Time Zones

Time zones and daylight saving time create subtle bugs that are difficult to debug. Here’s how to handle them:

# Timezone-aware calculations
# Create times in different zones
ny_time <- as.POSIXct("2024-03-10 01:30:00", tz = "America/New_York")
la_time <- as.POSIXct("2024-03-10 01:30:00", tz = "America/Los_Angeles")

# These are actually 3 hours apart
diff_tz <- difftime(ny_time, la_time, units = "hours")
print(diff_tz)
# Time difference of -3 hours

# DST transition example (US springs forward on March 10, 2024)
before_dst <- as.POSIXct("2024-03-10 01:00:00", tz = "America/New_York")
after_dst <- as.POSIXct("2024-03-10 03:00:00", tz = "America/New_York")

# Only 1 hour passed, not 2
difftime(after_dst, before_dst, units = "hours")
# Time difference of 1 hours

# Handling NA values
dates_with_na <- data.frame(
  start = as.Date(c("2024-01-01", "2024-02-01", NA, "2024-04-01")),
  end = as.Date(c("2024-01-15", NA, "2024-03-15", "2024-04-10"))
)

# difftime propagates NA correctly
dates_with_na$duration <- as.numeric(difftime(dates_with_na$end, 
                                               dates_with_na$start, 
                                               units = "days"))
print(dates_with_na)
#        start        end duration
# 1 2024-01-01 2024-01-15       14
# 2 2024-02-01       <NA>       NA
# 3       <NA> 2024-03-15       NA
# 4 2024-04-01 2024-04-10        9

# Use na.rm in aggregations
mean(dates_with_na$duration, na.rm = TRUE)
# [1] 11.5

My recommendation: store all timestamps in UTC internally and convert to local time only for display. UTC has no daylight saving transitions, so stored values are never ambiguous and elapsed-time calculations stay consistent.
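
A minimal sketch of that workflow, assuming timestamps arrive already expressed in UTC: keep them as UTC POSIXct values, compute with difftime(), and apply a local time zone only in format(). The variable names and the choice of New York for display are illustrative.

```r
# Stored values: the same two instants as the DST example, expressed in UTC
stored_start <- as.POSIXct("2024-03-10 06:00:00", tz = "UTC")
stored_end   <- as.POSIXct("2024-03-10 07:00:00", tz = "UTC")

# Elapsed time is computed on the stored values
elapsed_hours <- as.numeric(difftime(stored_end, stored_start, units = "hours"))
elapsed_hours
# [1] 1

# Local time is applied only when formatting for display
shown_start <- format(stored_start, "%Y-%m-%d %H:%M", tz = "America/New_York")
shown_end   <- format(stored_end, "%Y-%m-%d %H:%M", tz = "America/New_York")
shown_start  # "2024-03-10 01:00"
shown_end    # "2024-03-10 03:00"
```

The displayed wall-clock times jump by two hours while the stored instants are one hour apart, which matches the DST example above.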

Alternatives and Best Practices

Base R’s difftime() works well for simple cases, but the lubridate package offers more expressive syntax and handles months and years properly:

library(lubridate)

start <- as.Date("2023-06-15")
end <- as.Date("2024-03-20")

# Base R approach
base_diff <- difftime(end, start, units = "days")
print(base_diff)
# Time difference of 279 days

# lubridate interval approach
lub_interval <- interval(start, end)
print(lub_interval)
# [1] 2023-06-15 UTC--2024-03-20 UTC

# lubridate gives you months and years
as.period(lub_interval, unit = "months")
# [1] "9m 5d 0H 0M 0S"

as.period(lub_interval, unit = "years")
# [1] "9m 5d 0H 0M 0S"  (leading zero components, here the years, are not printed)

# lubridate's shorthand operator
start %--% end / days(1)   # 279
start %--% end / weeks(1)  # 39.85714
start %--% end / months(1) # about 9.2 (the fractional part depends on month length)

# Time arithmetic with lubridate
end + months(3)  # Adds 3 calendar months
end + days(90)   # Adds exactly 90 days

When to use each:

  • Use difftime() for simple day/hour/minute calculations, when you want to minimize dependencies, or when working in production environments where package management is restricted.

  • Use lubridate when you need month or year arithmetic, when working with complex timezone logic, or when code readability is paramount.
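
If you need a whole-month count but cannot add lubridate, one possible base R sketch is shown below. whole_months() is a hypothetical helper written for this article, not part of base R or lubridate. It assumes end is not before start, and it treats a month as complete only once the day of the month has been reached, which is a simplification for end-of-month dates.

```r
# Hypothetical helper: count completed calendar months between two dates
whole_months <- function(start, end) {
  s <- as.POSIXlt(start)
  e <- as.POSIXlt(end)
  # Month difference from the year and month components
  months_apart <- (e$year - s$year) * 12 + (e$mon - s$mon)
  # Subtract one if the final month has not been completed yet
  months_apart - as.integer(e$mday < s$mday)
}

whole_months(as.Date("2023-06-15"), as.Date("2024-03-20"))
# [1] 9

whole_months(as.Date("2023-06-15"), as.Date("2024-03-14"))
# [1] 8
```

The first result agrees with the 9 months that the lubridate period reports for the same pair of dates.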

For most data analysis work, difftime() covers the common cases without adding dependencies. Learn it thoroughly before reaching for alternatives.
