How to Perform Seasonal Adjustment in Python
Key Insights
- Seasonal adjustment removes predictable patterns from time series data, revealing underlying trends and making year-over-year comparisons meaningful—essential for economic indicators, retail analytics, and demand forecasting.
- STL decomposition offers the most flexibility for modern applications, handling outliers and changing seasonal patterns better than classical methods, while X-13ARIMA-SEATS remains the gold standard for official statistics.
- Always validate your seasonal adjustment by examining residuals for remaining patterns; a properly adjusted series should show no remaining autocorrelation at seasonal lags and approximately normally distributed residuals.
Introduction to Seasonal Adjustment
Time series data often contains predictable patterns that repeat at fixed intervals—monthly sales spikes during holidays, quarterly earnings cycles, or weekly traffic patterns. These seasonal effects obscure the underlying trends that matter for decision-making. When retail executives see a 20% jump in November sales, they need to know whether business is genuinely improving or if it’s just typical holiday shopping.
Seasonal adjustment separates these recurring patterns from the true signal. Without it, you’ll mistake normal seasonal variation for meaningful change, leading to poor forecasts and misguided business decisions. Government agencies seasonally adjust employment data, GDP figures, and inflation rates before publishing them. E-commerce companies adjust sales metrics to identify real growth. Energy providers adjust consumption data to plan capacity independent of weather-driven seasonal demand.
The goal is straightforward: isolate and remove the seasonal component so you can analyze the trend and irregular variations separately.
Understanding Seasonal Decomposition
Time series decomposition breaks data into three components: trend (long-term movement), seasonal (repeating patterns), and residual (irregular fluctuations). You combine these components using either an additive or multiplicative model.
Additive model: Y(t) = T(t) + S(t) + R(t)
Use this when seasonal variations remain constant in absolute terms regardless of the trend level. If sales always spike by $10,000 in December whether your baseline is $50,000 or $100,000, that’s additive.
Multiplicative model: Y(t) = T(t) × S(t) × R(t)
Use this when seasonal variations scale with the trend. If December sales are always 150% of the average month, that’s multiplicative. Most real-world data exhibits multiplicative seasonality—percentage changes matter more than absolute changes.
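One practical consequence: taking logs turns a multiplicative series into an additive one, since log(T × S × R) = log(T) + log(S) + log(R). That means you can handle multiplicative seasonality with additive tooling. A minimal sketch on synthetic data (all names here are illustrative):

```python
import numpy as np
import pandas as pd

# Illustrative multiplicative series: the seasonal swing scales with the trend
dates = pd.date_range('2018-01-01', periods=72, freq='MS')
trend = np.linspace(100, 200, 72)
seasonal_factor = 1 + 0.3 * np.sin(np.arange(72) * 2 * np.pi / 12)
y = pd.Series(trend * seasonal_factor, index=dates)

# Peak-to-trough range grows with the level in the raw series...
raw_first = y.iloc[:12].max() - y.iloc[:12].min()
raw_last = y.iloc[-12:].max() - y.iloc[-12:].min()

# ...but is roughly constant after a log transform (multiplicative -> additive)
log_y = np.log(y)
log_first = log_y.iloc[:12].max() - log_y.iloc[:12].min()
log_last = log_y.iloc[-12:].max() - log_y.iloc[-12:].min()

print(f"raw range ratio (last/first year): {raw_last / raw_first:.2f}")
print(f"log range ratio (last/first year): {log_last / log_first:.2f}")
```

After adjusting on the log scale, exponentiate to return to the original units.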
Let’s decompose monthly retail sales data:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
# Generate sample retail sales data with trend and seasonality
np.random.seed(42)
dates = pd.date_range('2018-01-01', '2023-12-31', freq='MS')
trend = np.linspace(100, 150, len(dates))
seasonal = 10 * np.sin(np.arange(len(dates)) * 2 * np.pi / 12)
noise = np.random.normal(0, 3, len(dates))
sales = trend + seasonal + noise
df = pd.DataFrame({'sales': sales}, index=dates)
# Perform seasonal decomposition
decomposition = seasonal_decompose(df['sales'], model='additive', period=12)
# Plot components
fig, axes = plt.subplots(4, 1, figsize=(12, 10))
df['sales'].plot(ax=axes[0], title='Original Series')
decomposition.trend.plot(ax=axes[1], title='Trend')
decomposition.seasonal.plot(ax=axes[2], title='Seasonal')
decomposition.resid.plot(ax=axes[3], title='Residual')
plt.tight_layout()
plt.show()
# Extract seasonally adjusted series
adjusted_sales = df['sales'] - decomposition.seasonal
This basic decomposition uses moving averages internally. The trend line smooths out short-term fluctuations, the seasonal component captures the repeating pattern, and residuals show what’s left over.
Classical Seasonal Adjustment Methods
Classical decomposition uses centered moving averages to estimate the trend, then extracts seasonality from the detrended series. While simple, it has limitations: it loses observations at the beginning and end of the series, can’t handle missing data, and assumes constant seasonal patterns.
Here’s a manual implementation to understand the mechanics:
def classical_seasonal_adjustment(series, period=12):
    """
    Manually perform seasonal adjustment using centered moving averages.
    Assumes a monthly DatetimeIndex.
    """
    # Step 1: Estimate the trend with a centered moving average
    trend = series.rolling(window=period, center=True).mean()
    # Step 2: Detrend the series
    detrended = series - trend
    # Step 3: Average the detrended values for each calendar month,
    # then normalize the factors so they average to zero over a cycle
    seasonal = detrended.groupby(detrended.index.month).transform('mean')
    seasonal = seasonal - seasonal.mean()
    # Step 4: Remove the seasonal component
    adjusted = series - seasonal
    # Step 5: Residuals are what trend and seasonal leave unexplained
    residual = series - trend - seasonal
    return {
        'adjusted': adjusted,
        'trend': trend,
        'seasonal': seasonal,
        'residual': residual
    }
# Apply to our data
components = classical_seasonal_adjustment(df['sales'], period=12)
print("Original December value:", df['sales'].loc['2022-12-01'])
print("Seasonally adjusted December value:", components['adjusted'].loc['2022-12-01'])
print("Seasonal factor for December:", components['seasonal'].loc['2022-12-01'])
This approach works for stable, well-behaved data but breaks down with outliers or evolving seasonal patterns. That’s where modern methods excel.
STL Decomposition (Seasonal and Trend decomposition using Loess)
STL uses locally weighted regression (LOESS) to decompose time series, offering significant advantages: it handles any type of seasonality, is robust to outliers, and allows the seasonal component to change over time. It’s the best choice for most modern applications.
Key parameters:
- seasonal: Controls how quickly the seasonal component can change (odd integer)
- trend: Controls smoothness of the trend component
- robust: Whether to use robust fitting (recommended with outliers)
from statsmodels.tsa.seasonal import STL
# Load classic airline passenger data
passengers = pd.read_csv('airline_passengers.csv', index_col='Month', parse_dates=True)
# STL decomposition with different parameter settings
stl_default = STL(passengers['Passengers'], seasonal=13).fit()
stl_flexible = STL(passengers['Passengers'], seasonal=7, trend=15).fit()
stl_robust = STL(passengers['Passengers'], seasonal=13, robust=True).fit()
# Compare results
fig, axes = plt.subplots(3, 1, figsize=(12, 8))
axes[0].plot(passengers.index, passengers['Passengers'], label='Original')
axes[0].plot(passengers.index, stl_default.trend + stl_default.resid, label='STL Adjusted (default)')
axes[0].legend()
axes[0].set_title('Default STL (seasonal=13)')
axes[1].plot(passengers.index, stl_flexible.trend + stl_flexible.resid, label='STL Adjusted (flexible)')
axes[1].legend()
axes[1].set_title('Flexible STL (seasonal=7, trend=15)')
axes[2].plot(passengers.index, stl_robust.trend + stl_robust.resid, label='STL Adjusted (robust)')
axes[2].legend()
axes[2].set_title('Robust STL')
plt.tight_layout()
# Extract adjusted series
adjusted_passengers = passengers['Passengers'] - stl_default.seasonal
Lower seasonal values allow the seasonal pattern to change more quickly—useful for evolving businesses. Higher values assume more stable seasonality. For most applications, start with seasonal=13 for monthly data (must be odd) and adjust based on your domain knowledge.
X-13ARIMA-SEATS Method
X-13ARIMA-SEATS is the industry standard developed by the U.S. Census Bureau. It combines seasonal adjustment with ARIMA modeling for more sophisticated handling of calendar effects, outliers, and trading day adjustments. Government statistical agencies worldwide use it for official economic indicators.
The method automatically identifies and adjusts for:
- Moving holidays (Easter, Lunar New Year)
- Trading day effects (number of weekdays vs. weekends)
- Outliers and structural breaks
- Calendar effects (leap years, different month lengths)
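Trading-day effects are easy to see directly: the count of business days per month varies by more than 10%, which alone moves monthly totals. A quick pandas sketch of the raw ingredient of that adjustment:

```python
import pandas as pd

# Business days (Mon-Fri) per month in 2023: months with more weekdays
# mechanically record more activity
days = pd.date_range('2023-01-01', '2023-12-31', freq='D')
weekdays = pd.Series(1, index=days)[days.dayofweek < 5]
per_month = weekdays.resample('MS').sum()
print(per_month.min(), '-', per_month.max(), 'weekdays per month')
```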
import pandas as pd
# Load economic indicator data (e.g., unemployment rate)
unemployment = pd.read_csv('unemployment.csv', index_col='date', parse_dates=True)
# Perform X-13 seasonal adjustment
# Note: Requires X-13ARIMA-SEATS binary installed separately
try:
    from statsmodels.tsa.x13 import x13_arima_analysis

    # Run X-13 adjustment
    x13_result = x13_arima_analysis(
        unemployment['rate'],
        trading=True,        # Adjust for trading days
        outlier=True,        # Detect and adjust outliers
        forecast_periods=12  # Include forecasts
    )
    adjusted_unemployment = x13_result.seasadj
    # Compare original vs adjusted
    fig, ax = plt.subplots(figsize=(12, 6))
    unemployment['rate'].plot(ax=ax, label='Original', alpha=0.7)
    adjusted_unemployment.plot(ax=ax, label='Seasonally Adjusted', linewidth=2)
    ax.legend()
    ax.set_title('Unemployment Rate: Original vs. Seasonally Adjusted')
except Exception:  # ImportError, or X13NotFoundError if the binary is missing
    print("X-13 requires separate installation of Census Bureau software")
    # Fall back to STL for demonstration
    stl_unemp = STL(unemployment['rate'], seasonal=13, robust=True).fit()
    adjusted_unemployment = unemployment['rate'] - stl_unemp.seasonal
X-13 requires additional setup (installing the Census Bureau’s binary), which can be challenging. For most applications, STL provides 90% of the value with 10% of the complexity. Use X-13 when you need official-quality adjustments or must handle complex calendar effects.
Evaluating and Validating Results
Never trust seasonal adjustment blindly. Always validate that you’ve actually removed seasonality without distorting the underlying signal.
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox
def validate_seasonal_adjustment(original, adjusted, seasonal_period=12):
    """
    Comprehensive validation of seasonal adjustment
    """
    # 1. Visual inspection
    fig, axes = plt.subplots(3, 2, figsize=(14, 10))
    # Original vs adjusted
    axes[0, 0].plot(original.index, original, label='Original', alpha=0.6)
    axes[0, 0].plot(adjusted.index, adjusted, label='Adjusted', linewidth=2)
    axes[0, 0].legend()
    axes[0, 0].set_title('Original vs. Adjusted Series')
    # Seasonal component removed
    seasonal = original - adjusted
    axes[0, 1].plot(seasonal.index, seasonal)
    axes[0, 1].set_title('Removed Seasonal Component')
    # ACF of original
    plot_acf(original.dropna(), lags=36, ax=axes[1, 0])
    axes[1, 0].set_title('ACF: Original Series')
    # ACF of adjusted
    plot_acf(adjusted.dropna(), lags=36, ax=axes[1, 1])
    axes[1, 1].set_title('ACF: Adjusted Series')
    # Residuals: detrend the adjusted series with a centered rolling mean
    residuals = adjusted - adjusted.rolling(window=seasonal_period, center=True).mean()
    axes[2, 0].hist(residuals.dropna(), bins=30, edgecolor='black')
    axes[2, 0].set_title('Distribution of Residuals')
    # Q-Q plot
    from scipy import stats
    stats.probplot(residuals.dropna(), dist="norm", plot=axes[2, 1])
    axes[2, 1].set_title('Q-Q Plot')
    plt.tight_layout()
    # 2. Statistical tests
    # Ljung-Box test on the residuals at the seasonal lag; testing the
    # adjusted series directly would flag the trend, not leftover seasonality
    lb_test = acorr_ljungbox(residuals.dropna(), lags=[seasonal_period], return_df=True)
    print("Validation Results:")
    print(f"Ljung-Box test p-value at lag {seasonal_period}: {lb_test['lb_pvalue'].values[0]:.4f}")
    print("(p > 0.05: no evidence of remaining seasonality)")
    return fig
# Validate our STL adjustment
validate_seasonal_adjustment(df['sales'], adjusted_sales, seasonal_period=12)
Good seasonal adjustment should show:
- No significant spikes at seasonal lags in the ACF plot of adjusted data
- Ljung-Box test p-value > 0.05 (no evidence of remaining autocorrelation)
- Normally distributed residuals
- Removed seasonal component showing clear, stable patterns
Practical Applications and Best Practices
Choose your method based on your needs:
- STL: Default choice for most applications. Flexible, robust, handles outliers.
- Classical decomposition: Quick exploratory analysis, educational purposes.
- X-13ARIMA-SEATS: Official statistics, complex calendar effects, regulatory requirements.
Handle edge cases:
For multiple seasonal patterns (daily + weekly + yearly), use MSTL (Multiple STL):
from statsmodels.tsa.seasonal import MSTL
# Example: hourly data with daily and weekly seasonality
mstl = MSTL(hourly_data, periods=[24, 168]).fit()  # 24 hours, 168 hours/week
# With multiple periods, mstl.seasonal is a DataFrame (one column per period)
adjusted_hourly = hourly_data - mstl.seasonal.sum(axis=1)
End-to-end pipeline:
def seasonal_adjustment_pipeline(data, target_col, seasonal_period=12, method='stl'):
    """
    Complete pipeline: adjust, forecast, compare
    """
    # 1. Seasonal adjustment (fit on the full series for simplicity;
    # in production, fit on the training window only to avoid leakage)
    if method == 'stl':
        decomposition = STL(data[target_col], seasonal=seasonal_period + 1, robust=True).fit()
        adjusted = data[target_col] - decomposition.seasonal
    # 2. Split train/test
    train_size = int(len(data) * 0.8)
    train_orig = data[target_col][:train_size]
    train_adj = adjusted[:train_size]
    test = data[target_col][train_size:]
    # 3. Simple forecast comparison (naive: repeat last year's values)
    from sklearn.metrics import mean_absolute_error
    # Forecast without adjustment: last year of raw values
    forecast_orig = train_orig.iloc[-seasonal_period:].values
    # Forecast with adjustment: last year of adjusted values, re-seasonalized
    # with the seasonal factors of the test period
    test_seasonal = decomposition.seasonal.iloc[train_size:train_size + seasonal_period].values
    forecast_adj = train_adj.iloc[-seasonal_period:].values + test_seasonal
    # 4. Evaluate on the first year of the test set
    mae_orig = mean_absolute_error(test[:seasonal_period], forecast_orig)
    mae_adj = mean_absolute_error(test[:seasonal_period], forecast_adj)
    print(f"MAE without adjustment: {mae_orig:.2f}")
    print(f"MAE with adjustment: {mae_adj:.2f}")
    print(f"Improvement: {((mae_orig - mae_adj) / mae_orig * 100):.1f}%")
    return adjusted, decomposition
# Run pipeline
adjusted_series, decomp = seasonal_adjustment_pipeline(df, 'sales', seasonal_period=12)
Seasonal adjustment is not optional for time series analysis—it’s fundamental. Start with STL, validate thoroughly, and always compare results with and without adjustment to understand its impact on your specific use case.