Time Series Decomposition Explained
Key Insights
- Time series decomposition separates data into trend, seasonal, and residual components, making complex patterns easier to understand and model
- Choose additive decomposition when seasonal variations are constant, multiplicative when they grow proportionally with the trend level
- STL decomposition handles changing seasonal patterns and outliers better than classical methods, making it the preferred choice for most real-world applications
Introduction to Time Series Decomposition
Time series decomposition is the process of breaking down a time-dependent dataset into distinct components that reveal underlying patterns. Instead of analyzing a complex, noisy signal as a whole, decomposition lets you isolate and understand each contributing factor separately.
This technique is fundamental for forecasting, anomaly detection, and understanding business metrics. When you decompose retail sales data, for example, you can separate genuine growth trends from predictable holiday spikes and random noise. This clarity leads to better decisions and more accurate predictions.
Let’s start by loading a classic time series dataset and visualizing what we’re working with:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.datasets import co2
# Load weekly CO2 concentration data, then resample to monthly
# so the yearly cycle has period=12
data = co2.load().data
data = data.resample('MS').mean().interpolate()
data = data['co2']  # work with a 1-D series for decomposition
plt.figure(figsize=(12, 4))
plt.plot(data.index, data.values)
plt.title('CO2 Concentration Over Time')
plt.xlabel('Year')
plt.ylabel('CO2 (ppm)')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
This dataset shows atmospheric CO2 levels with clear upward trends and repeating seasonal patterns—perfect for demonstrating decomposition.
The Core Components
Every time series decomposition breaks data into these fundamental components:
Trend represents the long-term direction of your data. Is it growing, declining, or staying flat over time? The trend component smooths out short-term fluctuations to reveal the underlying trajectory. In stock prices, this might be a bull or bear market. In website traffic, it could be organic growth.
Seasonality captures patterns that repeat at fixed, known intervals—daily, weekly, monthly, or yearly. Retail sales spike during holidays. Energy consumption peaks in summer and winter. These predictable patterns are the seasonal component.
Residual (also called irregular or noise) is what’s left after removing trend and seasonality. This includes random fluctuations, measurement errors, and unexpected events that don’t fit the other patterns.
Some analysts also consider Cyclical components—fluctuations that occur at irregular intervals, like economic cycles. However, these are often hard to separate from trends without extensive historical data.
Here’s how to visualize each component:
from statsmodels.tsa.seasonal import seasonal_decompose
# Perform decomposition
decomposition = seasonal_decompose(data, model='additive', period=12)
# Plot all components
fig, axes = plt.subplots(4, 1, figsize=(12, 10))
data.plot(ax=axes[0], title='Original')
decomposition.trend.plot(ax=axes[1], title='Trend')
decomposition.seasonal.plot(ax=axes[2], title='Seasonal')
decomposition.resid.plot(ax=axes[3], title='Residual')
plt.tight_layout()
plt.show()
This visualization immediately reveals patterns that were obscured in the raw data.
Additive vs. Multiplicative Models
The relationship between components determines which decomposition model to use.
Additive decomposition assumes components sum together:
Y(t) = Trend(t) + Seasonal(t) + Residual(t)
Use additive models when seasonal variations remain roughly constant regardless of the trend level. If your retail store sees a $10,000 holiday boost whether annual revenue is $100K or $500K, that’s additive seasonality.
Multiplicative decomposition assumes components multiply:
Y(t) = Trend(t) × Seasonal(t) × Residual(t)
Use multiplicative models when seasonal variations scale with the trend. If holiday sales are always 30% above baseline regardless of absolute values, that’s multiplicative seasonality.
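To see the difference concretely, here is a small simulation (values invented for illustration) where one series gets a fixed-size seasonal swing and the other a fixed-percentage swing:

```python
import numpy as np
import pandas as pd

# Same trend, two kinds of seasonality
n = 60
idx = pd.date_range('2018-01-01', periods=n, freq='MS')
trend = np.linspace(100, 300, n)
pattern = np.sin(2 * np.pi * np.arange(n) / 12)

additive = pd.Series(trend + 20 * pattern, index=idx)               # fixed-size swings
multiplicative = pd.Series(trend * (1 + 0.2 * pattern), index=idx)  # fixed-percentage swings

# Compare seasonal swing size in the first year vs the last year
add_amp_early = np.ptp(additive.values[:12] - trend[:12])
add_amp_late = np.ptp(additive.values[-12:] - trend[-12:])
mult_amp_early = np.ptp(multiplicative.values[:12] - trend[:12])
mult_amp_late = np.ptp(multiplicative.values[-12:] - trend[-12:])
print(f"Additive swing:       {add_amp_early:.1f} early vs {add_amp_late:.1f} late")
print(f"Multiplicative swing: {mult_amp_early:.1f} early vs {mult_amp_late:.1f} late")
```

The additive swings stay the same size while the multiplicative swings grow with the trend, which is exactly the distinction the two models encode.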
Here’s a direct comparison:
# Create figure with subplots
fig, axes = plt.subplots(2, 4, figsize=(16, 8))
# Additive decomposition
add_decomp = seasonal_decompose(data, model='additive', period=12)
data.plot(ax=axes[0, 0], title='Original')
add_decomp.trend.plot(ax=axes[0, 1], title='Additive Trend')
add_decomp.seasonal.plot(ax=axes[0, 2], title='Additive Seasonal')
add_decomp.resid.plot(ax=axes[0, 3], title='Additive Residual')
# Multiplicative decomposition
mult_decomp = seasonal_decompose(data, model='multiplicative', period=12)
data.plot(ax=axes[1, 0], title='Original')
mult_decomp.trend.plot(ax=axes[1, 1], title='Multiplicative Trend')
mult_decomp.seasonal.plot(ax=axes[1, 2], title='Multiplicative Seasonal')
mult_decomp.resid.plot(ax=axes[1, 3], title='Multiplicative Residual')
plt.tight_layout()
plt.show()
Notice how multiplicative seasonal components are centered around 1.0 (representing percentage changes) while additive components center around 0.0 (representing absolute changes).
Classical Decomposition Methods
Classical decomposition uses moving averages to extract the trend, then derives seasonality and residuals from what remains. Here’s the process:
- Calculate the trend using a centered moving average
- Detrend the data by subtracting (additive) or dividing (multiplicative) the trend
- Calculate average seasonal indices for each period
- Remove seasonality to get residuals
Let’s implement this manually:
def classical_decomposition(series, period=12, model='additive'):
    series = series.squeeze()  # accept a single-column DataFrame too

    # Step 1: extract the trend with a centered moving average
    # (a simple even-window average; statsmodels uses a 2x12 MA,
    # so results differ slightly)
    trend = series.rolling(window=period, center=True).mean()

    # Step 2: detrend the data
    if model == 'additive':
        detrended = series - trend
    else:
        detrended = series / trend

    # Step 3: average the detrended values by month, then normalize
    # the indices to center on 0 (additive) or 1 (multiplicative)
    seasonal = detrended.groupby(detrended.index.month).mean()
    if model == 'additive':
        seasonal = seasonal - seasonal.mean()
    else:
        seasonal = seasonal / seasonal.mean()
    seasonal_series = pd.Series([seasonal[month] for month in series.index.month],
                                index=series.index)

    # Step 4: whatever remains after removing trend and seasonality is the residual
    if model == 'additive':
        residual = series - trend - seasonal_series
    else:
        residual = series / (trend * seasonal_series)

    return trend, seasonal_series, residual
# Apply manual decomposition
trend, seasonal, residual = classical_decomposition(data, period=12, model='additive')
# Compare with statsmodels
print(f"Manual trend mean: {trend.mean():.2f}")
print(f"Statsmodels trend mean: {decomposition.trend.mean():.2f}")
Classical decomposition is simple and interpretable, but it has limitations. It struggles with missing data, doesn’t handle changing seasonal patterns, and loses data at the series boundaries due to the moving average.
STL Decomposition
STL (Seasonal and Trend decomposition using Loess) addresses classical decomposition’s shortcomings. It uses locally weighted regression (LOESS) to extract components, making it more robust and flexible.
Key advantages:
- Handles seasonality of any period length, not just monthly or quarterly data
- Allows the seasonal component to change over time
- Gives explicit control over the smoothness of the trend and seasonal components
- Robust to outliers when the robust option is enabled
from statsmodels.tsa.seasonal import STL
# Apply STL decomposition
stl = STL(data, seasonal=13, trend=15, robust=True)
stl_result = stl.fit()
# Plot results
fig, axes = plt.subplots(4, 1, figsize=(12, 10))
data.plot(ax=axes[0], title='Original')
stl_result.trend.plot(ax=axes[1], title='STL Trend')
stl_result.seasonal.plot(ax=axes[2], title='STL Seasonal')
stl_result.resid.plot(ax=axes[3], title='STL Residual')
plt.tight_layout()
plt.show()
# Compare residual variance
print(f"Classical residual std: {decomposition.resid.std():.4f}")
print(f"STL residual std: {stl_result.resid.std():.4f}")
The seasonal parameter controls the smoothness of the seasonal component and the trend parameter controls the smoothness of the trend; both must be odd integers. Lower values create more flexible components; higher values create smoother ones. The robust flag downweights outliers so they don't distort the fitted components.
Practical Applications
Decomposition isn’t just academic—it solves real problems.
Deseasonalizing for trend analysis: Remove seasonal patterns to see if your business is actually growing or just experiencing normal seasonal variation.
# Remove seasonality to see adjusted trend
deseasonalized = data - stl_result.seasonal
plt.figure(figsize=(12, 4))
plt.plot(data.index, data.values, label='Original', alpha=0.5)
plt.plot(deseasonalized.index, deseasonalized.values,
label='Deseasonalized', linewidth=2)
plt.legend()
plt.title('Original vs. Deseasonalized Data')
plt.show()
Feature engineering for forecasting: Use decomposed components as features in machine learning models. Keep in mind that trend and seasonal values are only available for historical observations, so genuine out-of-sample forecasting requires extrapolating them first.
# Create features from decomposition
features_df = pd.DataFrame({
'value': data.values.flatten(),
'trend': stl_result.trend.values,
'seasonal': stl_result.seasonal.values,
'day_of_year': data.index.dayofyear,
'month': data.index.month
})
# Use for forecasting model
from sklearn.ensemble import RandomForestRegressor
X = features_df[['trend', 'seasonal', 'day_of_year', 'month']].iloc[:-12]
y = features_df['value'].iloc[:-12]
X_test = features_df[['trend', 'seasonal', 'day_of_year', 'month']].iloc[-12:]
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, y)
predictions = model.predict(X_test)
Anomaly detection: Outliers in the residual component indicate unusual events that don’t fit normal patterns.
# Identify anomalies in residuals
residual_std = stl_result.resid.std()
anomalies = np.abs(stl_result.resid) > 3 * residual_std
print(f"Found {anomalies.sum()} anomalies")
print(data[anomalies])
Best Practices and Pitfalls
Choose the right period: Your seasonal period must match your data's actual seasonality. Monthly data with yearly patterns needs period=12; weekly data with yearly patterns needs period=52.
Validate your decomposition: Check residuals for patterns. If residuals show structure, your decomposition missed something.
# Residual diagnostics
from scipy import stats
residuals = stl_result.resid.dropna()
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
# Residual plot
axes[0].scatter(range(len(residuals)), residuals, alpha=0.5)
axes[0].axhline(y=0, color='r', linestyle='--')
axes[0].set_title('Residual Plot')
# Histogram
axes[1].hist(residuals, bins=30, edgecolor='black')
axes[1].set_title('Residual Distribution')
# Q-Q plot
stats.probplot(residuals, dist="norm", plot=axes[2])
axes[2].set_title('Q-Q Plot')
plt.tight_layout()
plt.show()
# Summary statistics: residual mean should be near zero
print(f"Residual mean: {residuals.mean():.6f}")
print(f"Residual std: {residuals.std():.4f}")
Don’t over-decompose: If your seasonal component is more volatile than your original data, you’ve over-fitted. Increase smoothing parameters.
Consider data transformation: If you’re unsure between additive and multiplicative, try log-transforming your data first, then use additive decomposition. This often works well for data with exponential growth.
Mind the boundaries: Classical methods lose data at series edges. STL handles this better, but still be cautious with predictions at boundaries.
Time series decomposition transforms opaque data into interpretable components. Master these techniques, and you’ll understand your data’s story—not just its numbers.