How to Difference a Time Series in Python
Key Insights
- Differencing removes trend and seasonality by computing changes between observations, transforming non-stationary time series into stationary ones suitable for forecasting models like ARIMA
- First-order differencing handles linear trends, seasonal differencing removes periodic patterns at specific lags, and you can combine both techniques for complex series
- Always validate stationarity with statistical tests (ADF, KPSS) and avoid over-differencing, which introduces unnecessary noise and complicates interpretation
Introduction to Time Series Differencing
Time series differencing is the process of transforming a series by computing the differences between consecutive observations. This simple yet powerful technique is fundamental to time series analysis because most statistical forecasting methods require stationarity—a property where statistical characteristics like mean and variance remain constant over time.
Non-stationary series exhibit trends, changing variance, or seasonal patterns that violate stationarity assumptions. Differencing removes these components, revealing the underlying signal that models can learn from effectively.
Let’s visualize the difference between a non-stationary and stationary series:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Create a non-stationary series with trend
np.random.seed(42)
time = np.arange(200)
trend = 0.5 * time
seasonal = 10 * np.sin(2 * np.pi * time / 50)
noise = np.random.normal(0, 3, 200)
non_stationary = trend + seasonal + noise
# Apply first-order differencing
stationary = np.diff(non_stationary)
# Visualize
fig, axes = plt.subplots(2, 1, figsize=(12, 8))
axes[0].plot(non_stationary)
axes[0].set_title('Non-Stationary Series (with trend)', fontsize=14)
axes[0].set_ylabel('Value')
axes[1].plot(stationary)
axes[1].set_title('Stationary Series (after differencing)', fontsize=14)
axes[1].set_ylabel('Differenced Value')
axes[1].set_xlabel('Time')
plt.tight_layout()
plt.show()
The first plot shows a clear upward trend, while the differenced series fluctuates around a constant mean—a hallmark of stationarity.
First-Order Differencing
First-order differencing computes the change between consecutive observations using the formula: y'(t) = y(t) - y(t-1). This removes linear trends and is the most commonly applied differencing technique.
Pandas makes this trivial with the .diff() method:
# Load example data - using stock prices
dates = pd.date_range('2023-01-01', periods=100, freq='D')
stock_prices = pd.Series(
    100 + np.cumsum(np.random.randn(100) * 2),
    index=dates
)
# Apply first-order differencing
price_changes = stock_prices.diff()
# The first value is NaN since there's no previous observation
print(price_changes.head())
Output:
2023-01-01 NaN
2023-01-02 -0.496714
2023-01-03 1.291598
2023-01-04 -0.863652
2023-01-05 2.269755
Notice the first value is NaN. Always drop or handle these missing values before modeling:
price_changes = price_changes.dropna()
For financial data, first-order differencing transforms absolute prices into period-to-period price changes (and differencing log prices yields log returns)—the quantities that actually matter for trading decisions. For temperature data, it reveals day-to-day changes rather than absolute readings.
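As an aside, pandas also provides .pct_change() for relative changes. A small sketch contrasting absolute differences, simple returns, and log returns on illustrative prices:

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 102.0, 99.96, 104.958])

abs_changes = prices.diff()                       # y(t) - y(t-1)
simple_returns = prices.pct_change(fill_method=None)  # (y(t) - y(t-1)) / y(t-1)
log_returns = np.log(prices).diff()               # close to simple returns for small moves

print(abs_changes.round(3).tolist())    # [nan, 2.0, -2.04, 4.998]
print(simple_returns.round(4).tolist()) # [nan, 0.02, -0.02, 0.05]
```

Absolute changes depend on the price level; returns do not, which is why returns are usually the preferred transformation for comparing assets.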
Seasonal Differencing
When your data exhibits repeating patterns at fixed intervals, seasonal differencing removes these cycles by subtracting observations separated by the seasonal lag. For monthly data with yearly seasonality, use lag=12. For daily data with weekly patterns, use lag=7.
The formula is: y'(t) = y(t) - y(t-s) where s is the seasonal period.
from statsmodels.datasets import co2
# Load the CO2 dataset: weekly observations with annual seasonality
data = co2.load_pandas().data
co2_series = data['co2'].resample('W').mean().ffill()
# Apply seasonal differencing (52 weeks)
seasonal_diff = co2_series.diff(52)
# Visualize
fig, axes = plt.subplots(2, 1, figsize=(12, 8))
axes[0].plot(co2_series)
axes[0].set_title('Original CO2 Levels', fontsize=14)
axes[1].plot(seasonal_diff)
axes[1].set_title('After Seasonal Differencing (lag=52)', fontsize=14)
plt.tight_layout()
plt.show()
For series with both trend and seasonality, combine both techniques:
# First remove seasonal component
seasonal_diff = co2_series.diff(52)
# Then remove remaining trend
fully_differenced = seasonal_diff.diff()
# Drop NaN values
fully_differenced = fully_differenced.dropna()
Mathematically the order doesn't matter, since the two differencing operators commute, but convention applies seasonal differencing first, then regular differencing.
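A quick check confirms the two orders agree; this sketch uses a synthetic series with a lag-7 season for brevity:

```python
import numpy as np
import pandas as pd

# Synthetic series with a linear trend and a weekly (lag-7) season
rng = np.random.default_rng(0)
t = np.arange(100)
y = pd.Series(0.3 * t + 5 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 1, 100))

a = y.diff(7).diff()   # seasonal first, then regular
b = y.diff().diff(7)   # regular first, then seasonal

# The two orderings are identical wherever both are defined
print(np.allclose(a.dropna(), b.dropna()))  # True
```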
Higher-Order Differencing
When first-order differencing doesn’t achieve stationarity—typically with quadratic or exponential trends—apply second-order differencing. This differences the already-differenced series:
# Create series with quadratic trend
time = np.arange(150)
quadratic_trend = 0.02 * time**2
noise = np.random.normal(0, 5, 150)
quadratic_series = pd.Series(quadratic_trend + noise)
# First-order differencing
first_diff = quadratic_series.diff()
# Second-order differencing
second_diff = first_diff.diff()
# Visualize all three
fig, axes = plt.subplots(3, 1, figsize=(12, 10))
axes[0].plot(quadratic_series)
axes[0].set_title('Original Series (quadratic trend)')
axes[1].plot(first_diff)
axes[1].set_title('First-Order Differencing (still trending)')
axes[2].plot(second_diff)
axes[2].set_title('Second-Order Differencing (stationary)')
plt.tight_layout()
plt.show()
Note that the periods parameter is not a shortcut here: quadratic_series.diff(periods=2) computes y(t) - y(t-2), a lag-2 difference, which is not second-order differencing. To difference twice, chain the calls:
# Second-order differencing: difference the differenced series
second_diff = quadratic_series.diff().diff()
However, be cautious: over-differencing introduces unnecessary complexity and noise. Most real-world series need at most second-order differencing.
Testing for Stationarity
Don’t guess whether your series is stationary—test it. The Augmented Dickey-Fuller (ADF) test is the standard choice. Its null hypothesis is that the series has a unit root (is non-stationary); a p-value below 0.05 rejects that null, indicating stationarity.
from statsmodels.tsa.stattools import adfuller, kpss
def test_stationarity(series, name='Series'):
    """Perform ADF and KPSS tests for stationarity."""
    # ADF test (null hypothesis: series has a unit root)
    adf_result = adfuller(series.dropna(), autolag='AIC')
    print(f'\n{name} - ADF Test Results:')
    print(f'ADF Statistic: {adf_result[0]:.6f}')
    print(f'p-value: {adf_result[1]:.6f}')
    print(f'Stationary: {adf_result[1] < 0.05}')
    # KPSS test (null hypothesis: series is stationary)
    kpss_result = kpss(series.dropna(), regression='c', nlags='auto')
    print(f'\n{name} - KPSS Test Results:')
    print(f'KPSS Statistic: {kpss_result[0]:.6f}')
    print(f'p-value: {kpss_result[1]:.6f}')
    print(f'Stationary: {kpss_result[1] > 0.05}')
# Test original vs differenced series
test_stationarity(stock_prices, 'Original Stock Prices')
test_stationarity(price_changes, 'Differenced Stock Prices')
The KPSS test complements ADF by having the opposite null hypothesis (stationarity). Use both for confidence—ideally, ADF rejects non-stationarity while KPSS fails to reject stationarity.
Inverting Differenced Data
After forecasting with differenced data, you must transform predictions back to the original scale. This requires the cumulative sum operation, the inverse of differencing.
# Simulate a forecasting scenario
original_series = pd.Series([100, 102, 105, 103, 107, 110])
differenced = original_series.diff().dropna()
print("Differenced:", differenced.values)
# Output: [ 2. 3. -2. 4. 3.]
# Invert differencing
def invert_difference(differenced_series, original_first_value):
    """Reconstruct original series from differenced values."""
    # Prepend the first original value, cumulative-sum, drop the seed value
    return np.concatenate([[original_first_value],
                           differenced_series]).cumsum()[1:]
reconstructed = invert_difference(differenced.values, original_series.iloc[0])
print("Reconstructed:", reconstructed)
# Output: [102. 105. 103. 107. 110.]
For multi-step forecasts:
# Forecast next 3 differenced values
forecasted_diffs = np.array([2.5, 1.8, 3.2])
# Last known original value
last_value = original_series.iloc[-1]
# Reconstruct forecasts
forecasted_values = np.concatenate([[last_value],
                                    forecasted_diffs]).cumsum()[1:]
print("Forecasted original scale:", forecasted_values)
# Output: [112.5 114.3 117.5]
For seasonal differencing with lag s, reconstruction requires values from s periods back.
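A minimal sketch of that seasonal inversion, using the last s observed values as anchors (the function name and data are illustrative):

```python
import numpy as np

def invert_seasonal_difference(forecasted_diffs, last_s_values, s):
    """Reconstruct original-scale forecasts from seasonally
    differenced forecasts: y(t) = y'(t) + y(t-s)."""
    history = list(last_s_values)  # most recent s original observations
    result = []
    for diff in forecasted_diffs:
        value = diff + history[-s]  # add back the value s steps earlier
        result.append(value)
        history.append(value)      # forecasts become anchors for later steps
    return np.array(result)

# Quarterly example (s=4): last 4 observed values, 4 forecasted differences
last_four = [110, 120, 130, 125]
diffs = [5.0, 3.0, -2.0, 4.0]
print(invert_seasonal_difference(diffs, last_four, s=4))
# [115. 123. 128. 129.]
```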
Practical Considerations and Best Practices
Avoid over-differencing. Each differencing operation removes information. If your ADF test shows stationarity after first-order differencing, stop there. Over-differenced series show excessive volatility and negative autocorrelation at lag 1.
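That lag-1 symptom is easy to demonstrate: differencing a series that is already white noise produces an MA(1) process whose lag-1 autocorrelation sits near -0.5. A quick check:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
white_noise = pd.Series(np.random.randn(1000))  # already stationary
over_diffed = white_noise.diff().dropna()       # unnecessary differencing

# Lag-1 autocorrelation: near 0 for white noise, near -0.5 when over-differenced
print(round(white_noise.autocorr(lag=1), 2))
print(round(over_diffed.autocorr(lag=1), 2))   # close to -0.5
```

A strongly negative lag-1 autocorrelation in your differenced series is a signal to back off one differencing order.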
Integrate differencing into ARIMA models. The “I” in ARIMA stands for “Integrated”—the order of differencing. Let the model handle it:
from statsmodels.tsa.arima.model import ARIMA
# Load data
dates = pd.date_range('2020-01-01', periods=200, freq='D')
sales = pd.Series(
    50 + 0.5*np.arange(200) + np.random.randn(200)*5,
    index=dates
)
# ARIMA(p, d, q) where d is the differencing order
# d=1 means the model applies first-order differencing internally
model = ARIMA(sales, order=(1, 1, 1))
fitted_model = model.fit()
# Forecast (automatically inverts differencing)
forecast = fitted_model.forecast(steps=10)
print(forecast)
# Check model summary
print(fitted_model.summary())
The model automatically differences the data during fitting and inverts transformations during forecasting. You get predictions in the original scale without manual intervention.
Preserve the first values. When differencing, store the initial observations needed for inversion. For first-order differencing, save the first value. For seasonal differencing with lag 12, save the first 12 values.
Consider alternatives for trend removal. Differencing isn’t the only option. Detrending via regression or decomposition preserves more information but requires stronger assumptions about the trend’s functional form. Differencing is assumption-free and robust, making it the preferred choice for most applications.
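For comparison, here is a sketch of regression-based detrending with np.polyfit, assuming a linear trend (the synthetic series is illustrative):

```python
import numpy as np
import pandas as pd

np.random.seed(1)
t = np.arange(200)
series = pd.Series(5 + 0.4 * t + np.random.normal(0, 2, 200))

# Fit a straight line and subtract it; unlike differencing, this
# keeps every observation and preserves the series' scale
slope, intercept = np.polyfit(t, series, deg=1)
detrended = series - (slope * t + intercept)

print(slope)             # close to the true slope 0.4
print(detrended.mean())  # residuals average to ~0 by construction
```

The trade-off: this works only if the assumed functional form (linear here) is right, whereas differencing makes no such assumption.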
Differencing is a foundational technique that bridges raw time series data and sophisticated forecasting models. Master it, test rigorously, and integrate it properly into your modeling pipeline for reliable predictions.