How to Decompose a Time Series in Python

Key Insights

  • Time series decomposition separates data into trend, seasonal, and residual components, making it easier to understand underlying patterns and build better forecasting models
  • Classical decomposition works well for stable patterns, but STL decomposition handles non-linear trends and changing seasonality more robustly
  • Always inspect residuals after decomposition—they should resemble white noise; patterns in residuals indicate your model missed important information

Introduction to Time Series Decomposition

Time series decomposition is the process of breaking down a time series into its constituent components: trend, seasonality, and residuals. This technique is fundamental to understanding temporal data because it isolates different patterns that occur at different time scales.

Why decompose? First, it makes complex patterns interpretable. A messy sales chart becomes clear when you separate the upward growth trend from monthly seasonal peaks. Second, decomposition improves forecasting accuracy by letting you model each component separately. Third, it enables better anomaly detection—unusual values in the residuals stand out more clearly than in the raw data.

Let’s start by loading and visualizing a classic time series dataset:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.datasets import co2

# Load atmospheric CO2 data (weekly readings)
data = co2.load_pandas().data['co2']
data = data.resample('MS').mean().interpolate()  # monthly averages, gaps filled

plt.figure(figsize=(12, 4))
plt.plot(data.index, data.values)
plt.title('Atmospheric CO2 Concentration')
plt.xlabel('Date')
plt.ylabel('CO2 (ppm)')
plt.tight_layout()
plt.show()

This dataset shows a clear upward trend with regular seasonal fluctuations—perfect for decomposition.

Understanding the Components

Every time series decomposition separates data into three components:

Trend represents the long-term direction of your data. It captures sustained increases, decreases, or stability over time. Trends can be linear (steady growth) or non-linear (accelerating or decelerating). In sales data, trend shows whether your business is growing. In climate data, it reveals long-term warming or cooling.

Seasonality captures regular, repeating patterns at fixed intervals. Monthly retail sales spike in December. Website traffic drops on weekends. Electricity demand peaks in summer and winter. Seasonality is predictable and cyclical, occurring at known frequencies (daily, weekly, monthly, quarterly, yearly).

Residuals are what’s left after removing trend and seasonality. They represent irregular fluctuations, random noise, or unexplained variation. Ideally, residuals should look like white noise—no patterns, just random variation around zero. Patterns in residuals indicate your decomposition missed something important.

Two fundamental models exist for combining these components:

Additive model: Y(t) = Trend(t) + Seasonal(t) + Residual(t)

Use this when seasonal variations remain roughly constant regardless of the trend level. If December always adds about the same number of extra sales whether you’re selling 100 or 1,000 units a month, that’s additive.

Multiplicative model: Y(t) = Trend(t) × Seasonal(t) × Residual(t)

Use this when seasonal variations scale with the trend. If December sales are always 20% higher than the trend level (so the absolute increase grows as your business grows), that’s multiplicative.
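
A handy consequence: taking the logarithm of a multiplicative series turns it into an additive one, so the same additive machinery applies. A minimal sketch with synthetic data (names and parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(120)
trend = 100 + t                                    # rising trend level
seasonal = 1 + 0.2 * np.sin(2 * np.pi * t / 12)   # +/-20% swing around the trend
noise = rng.normal(1.0, 0.02, len(t))             # multiplicative noise near 1
y = trend * seasonal * noise                       # multiplicative series

# After a log transform the components combine additively:
# log(y) = log(trend) + log(seasonal) + log(noise)
log_y = np.log(y)
```

This is why log-transforming before an additive decomposition is a common alternative to fitting a multiplicative model directly.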

Here’s how these components look in isolation:

# Generate synthetic data to illustrate components
np.random.seed(42)
time = np.arange(0, 100)
trend = time * 0.5
seasonal = 10 * np.sin(2 * np.pi * time / 12)
residual = np.random.normal(0, 2, 100)
additive_series = trend + seasonal + residual

fig, axes = plt.subplots(4, 1, figsize=(12, 10))
axes[0].plot(time, additive_series)
axes[0].set_title('Original Series')
axes[1].plot(time, trend)
axes[1].set_title('Trend Component')
axes[2].plot(time, seasonal)
axes[2].set_title('Seasonal Component')
axes[3].plot(time, residual)
axes[3].set_title('Residual Component')
plt.tight_layout()
plt.show()

Classical Decomposition with statsmodels

The seasonal_decompose() function from statsmodels provides straightforward classical decomposition using moving averages. It’s simple, fast, and works well for data with stable patterns.

from statsmodels.tsa.seasonal import seasonal_decompose

# Perform decomposition
decomposition = seasonal_decompose(
    data, 
    model='additive',  # or 'multiplicative'
    period=12,  # yearly cycle in monthly data
    extrapolate_trend='freq'  # handle endpoints
)

# Plot all components
fig, axes = plt.subplots(4, 1, figsize=(12, 10))
decomposition.observed.plot(ax=axes[0])
axes[0].set_ylabel('Observed')
decomposition.trend.plot(ax=axes[1])
axes[1].set_ylabel('Trend')
decomposition.seasonal.plot(ax=axes[2])
axes[2].set_ylabel('Seasonal')
decomposition.resid.plot(ax=axes[3])
axes[3].set_ylabel('Residual')
plt.tight_layout()
plt.show()

The period parameter is critical—it defines the seasonal cycle length. For monthly data with yearly seasonality, use 12. For daily data with weekly patterns, use 7. Get this wrong and your decomposition will be meaningless.

Classical decomposition has limitations. It uses simple moving averages, which can’t handle non-linear trends well. The seasonal component is assumed constant throughout the series. And it loses observations at the beginning and end of your data (though extrapolate_trend helps).

STL Decomposition (Seasonal-Trend decomposition using LOESS)

STL decomposition is more sophisticated and robust. It uses LOESS (locally weighted regression) to extract trend and seasonal components, allowing them to change over time.

from statsmodels.tsa.seasonal import STL

# Perform STL decomposition
stl = STL(
    data, 
    seasonal=13,  # must be odd, controls seasonal smoothness
    trend=None,  # auto-calculated if None
    robust=True  # resistant to outliers
)
stl_result = stl.fit()

# Plot results
fig, axes = plt.subplots(4, 1, figsize=(12, 10))
stl_result.observed.plot(ax=axes[0])
axes[0].set_ylabel('Observed')
stl_result.trend.plot(ax=axes[1])
axes[1].set_ylabel('Trend')
stl_result.seasonal.plot(ax=axes[2])
axes[2].set_ylabel('Seasonal')
stl_result.resid.plot(ax=axes[3])
axes[3].set_ylabel('Residual')
plt.tight_layout()
plt.show()

STL advantages:

  • Handles non-linear trends naturally
  • Allows seasonal component to change over time
  • Robust to outliers when robust=True
  • No data loss at endpoints
  • More flexible parameter control

The seasonal parameter controls how quickly the seasonal component can change. Larger values (more smoothing) assume more stable seasonality; smaller values allow more variation. It must be an odd integer, so start with the first odd number above your period (13 for monthly data) and adjust based on your data.

Extracting and Using Decomposed Components

Decomposition isn’t just for visualization—you can extract and use individual components for analysis and modeling.

# Extract components
trend = stl_result.trend
seasonal = stl_result.seasonal
residual = stl_result.resid

# Deseasonalize the data (remove seasonal component)
deseasonalized = data - seasonal

plt.figure(figsize=(12, 4))
plt.plot(data.index, data.values, label='Original', alpha=0.5)
plt.plot(deseasonalized.index, deseasonalized.values, 
         label='Deseasonalized', linewidth=2)
plt.legend()
plt.title('Original vs Deseasonalized Series')
plt.show()

# Analyze trend for forecasting
from scipy import stats
x = np.arange(len(trend))
slope, intercept, r_value, p_value, std_err = stats.linregress(
    x, trend.dropna()
)
print(f"Trend slope: {slope:.4f} ppm/month")
print(f"R-squared: {r_value**2:.4f}")

# Check residuals for stationarity (should be white noise)
from statsmodels.stats.diagnostic import acorr_ljungbox
lb_test = acorr_ljungbox(residual.dropna(), lags=[10], return_df=True)
print(f"\nLjung-Box test p-value: {lb_test['lb_pvalue'].values[0]:.4f}")
if lb_test['lb_pvalue'].values[0] > 0.05:
    print("Residuals appear to be white noise (good!)")
else:
    print("Residuals show autocorrelation (model may be incomplete)")

Deseasonalized data is crucial for many applications. It lets you see underlying trends without seasonal noise. Use it for year-over-year comparisons or when building models that don’t explicitly handle seasonality.
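
For instance, year-over-year growth rates are easier to read once the seasonal swing is removed. A small sketch, using a hand-built seasonal series as a stand-in for a fitted component:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
idx = pd.date_range('2018-01-01', periods=48, freq='MS')
t = np.arange(48)
y = pd.Series(200 + 3 * t + 30 * np.sin(2 * np.pi * t / 12)
              + rng.normal(0, 2, 48), index=idx)
# Stand-in for the seasonal component a decomposition would return
seasonal = pd.Series(30 * np.sin(2 * np.pi * t / 12), index=idx)
deseason = y - seasonal

# Year-over-year growth on the deseasonalized series
yoy_pct = deseason.pct_change(12) * 100
print(yoy_pct.dropna().round(1).tail(3))
```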

Practical Applications and Best Practices

Choosing additive vs multiplicative: Plot your data. If seasonal swings grow proportionally with the trend level, use multiplicative. If they stay constant, use additive. When in doubt, try both and check which produces more random-looking residuals.

Selecting seasonal period: This should match your data’s natural cycle. For business data, try 12 (monthly), 4 (quarterly), or 7 (daily with weekly patterns). For hourly data, try 24 (daily patterns) or 168 (weekly patterns). Wrong period selection produces meaningless results.

Handling missing data: Interpolate before decomposition. Both classical and STL methods require complete time series. Use fillna() with interpolation or forward-fill, but document your approach.

Complete workflow example:

# Load your data
df = pd.read_csv('sales_data.csv', parse_dates=['date'], index_col='date')
sales = df['sales'].asfreq('D')  # ensure regular frequency

# Handle missing values
sales = sales.interpolate(method='time')

# Decompose
stl = STL(sales, seasonal=13, robust=True)
result = stl.fit()

# Identify anomalies in residuals (>3 standard deviations)
residual_std = result.resid.std()
anomalies = result.resid[abs(result.resid) > 3 * residual_std]

print(f"Found {len(anomalies)} anomalies:")
print(anomalies.sort_values(ascending=False).head())

# Create forecast using trend
from sklearn.linear_model import LinearRegression
X = np.arange(len(result.trend)).reshape(-1, 1)
y = result.trend.values
model = LinearRegression().fit(X, y)

# Forecast next 30 days
future_X = np.arange(len(result.trend), 
                     len(result.trend) + 30).reshape(-1, 1)
trend_forecast = model.predict(future_X)

# Add back the seasonal pattern: repeat the last full weekly cycle
weekly_pattern = result.seasonal.iloc[-7:].values
seasonal_forecast = np.tile(weekly_pattern, 5)[:30]  # phase-aligned next 30 days
forecast = trend_forecast + seasonal_forecast

print(f"\n30-day forecast: {forecast}")

Time series decomposition is a foundational technique that should be in every data scientist’s toolkit. Start with STL decomposition for most applications—it’s robust and flexible. Always validate your decomposition by inspecting residuals. And remember: decomposition reveals patterns, but you still need domain knowledge to interpret what those patterns mean for your specific problem.
