How to Implement Croston's Method in Python

Key Insights

Croston’s method solves intermittent demand forecasting by separately modeling demand size and inter-demand intervals, then dividing smoothed demand by smoothed intervals to generate forecasts
The smoothing parameter alpha (typically 0.1-0.2) controls responsiveness, with lower values providing more stability for sporadic demand patterns
The Syntetos-Boylan Approximation (SBA) variant corrects Croston’s positive bias by multiplying forecasts by (1 - alpha/2), improving accuracy for inventory optimization

Introduction to Croston’s Method

Intermittent demand—characterized by periods of zero demand interspersed with occasional non-zero values—breaks traditional forecasting methods. Exponential smoothing and ARIMA models assume continuous demand patterns and perform poorly when 60-70% of observations are zeros, a common scenario in spare parts inventory, slow-moving retail items, and specialized equipment.

Croston’s method, introduced in 1972, tackles this problem with an elegant insight: instead of forecasting the lumpy demand series directly, model two separate components. First, forecast the size of demand when it occurs. Second, forecast the time interval between demand occurrences. The final forecast is the ratio of these smoothed values.

This decomposition handles sparsity naturally. When demand doesn’t occur, only the interval component updates. When demand materializes, both components adjust. The result is stable forecasts that don’t overreact to zeros or occasional spikes.

Understanding the Mathematics

Croston’s method maintains two exponentially smoothed values: average demand size (z) and average inter-demand interval (p). Both use the same smoothing parameter α.

When demand occurs at time t with value d_t > 0:

Update demand size: z_t = α × d_t + (1 - α) × z_{t-1}
Update interval: p_t = α × q + (1 - α) × p_{t-1}, where q is periods since last demand
Forecast: f_t = z_t / p_t

When demand is zero, no updates occur—the previous forecast persists.

The smoothing parameter α controls responsiveness. Lower values (0.1-0.2) work better for highly intermittent series, preventing overreaction to individual demand events. Higher values increase sensitivity but risk instability.

import numpy as np

# Simple demonstration of Croston calculation
demand = np.array([0, 0, 5, 0, 0, 0, 3, 0, 4, 0])
alpha = 0.1

# Initialize
z = demand[demand > 0][0]  # First non-zero demand
p = np.where(demand > 0)[0][0] + 1  # Periods to first demand
forecast = z / p

print(f"Initial: z={z:.2f}, p={p:.2f}, forecast={forecast:.2f}")

# Process subsequent periods
periods_since_demand = 0
for t, d in enumerate(demand[1:], 1):
    periods_since_demand += 1
    if d > 0:
        z = alpha * d + (1 - alpha) * z
        p = alpha * periods_since_demand + (1 - alpha) * p
        forecast = z / p
        periods_since_demand = 0
        print(f"Period {t}: demand={d}, z={z:.2f}, p={p:.2f}, forecast={forecast:.2f}")

Basic Implementation from Scratch

A production-ready implementation needs proper initialization, state management, and prediction capabilities. Here’s a complete class-based approach:

import numpy as np
from typing import Optional

class CrostonsMethod:
    def __init__(self, alpha: float = 0.1):
        """
        Initialize Croston's forecasting method.
        
        Args:
            alpha: Smoothing parameter (0 < alpha < 1)
        """
        if not 0 < alpha < 1:
            raise ValueError("Alpha must be between 0 and 1")
        
        self.alpha = alpha
        self.z = None  # Smoothed demand size
        self.p = None  # Smoothed inter-demand interval
        self.forecast = None
        self.periods_since_demand = 0
        
    def fit(self, demand: np.ndarray) -> 'CrostonsMethod':
        """
        Fit the model to historical demand data.
        
        Args:
            demand: Array of historical demand values
        """
        if len(demand) == 0:
            raise ValueError("Demand series cannot be empty")
        
        # Initialize with first non-zero demand
        nonzero_indices = np.where(demand > 0)[0]
        if len(nonzero_indices) == 0:
            raise ValueError("Demand series must contain at least one non-zero value")
        
        first_demand_idx = nonzero_indices[0]
        self.z = demand[first_demand_idx]
        self.p = first_demand_idx + 1
        self.forecast = self.z / self.p
        self.periods_since_demand = 0
        
        # Process remaining periods
        for t in range(first_demand_idx + 1, len(demand)):
            self.periods_since_demand += 1
            
            if demand[t] > 0:
                # Update both components
                self.z = self.alpha * demand[t] + (1 - self.alpha) * self.z
                self.p = self.alpha * self.periods_since_demand + (1 - self.alpha) * self.p
                self.forecast = self.z / self.p
                self.periods_since_demand = 0
        
        return self
    
    def predict(self, horizon: int = 1) -> np.ndarray:
        """
        Generate forecasts for future periods.
        
        Args:
            horizon: Number of periods to forecast
            
        Returns:
            Array of forecast values
        """
        if self.forecast is None:
            raise ValueError("Model must be fitted before prediction")
        
        # Croston produces flat forecasts
        return np.full(horizon, self.forecast)


# Example usage
demand_series = np.array([0, 0, 5, 0, 0, 0, 3, 0, 4, 0, 0, 6, 0, 0, 2])
model = CrostonsMethod(alpha=0.1)
model.fit(demand_series)
forecasts = model.predict(horizon=5)

print(f"Fitted forecast: {model.forecast:.3f}")
print(f"5-period forecast: {forecasts}")

Handling Edge Cases and Optimizations

The basic implementation needs enhancements for production use. Three critical improvements:

Parameter Optimization: Instead of hardcoding alpha, use cross-validation to find optimal values for your specific demand pattern.

SBA Bias Correction: Croston’s method has a proven positive bias. The Syntetos-Boylan Approximation corrects this by multiplying forecasts by (1 - α/2).

Initialization Strategies: When demand is extremely sparse, initialization significantly impacts accuracy. Consider using multiple non-zero observations for initial values.

from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_error

class EnhancedCrostonsMethod(CrostonsMethod):
    def __init__(self, alpha: float = 0.1, bias_correction: bool = True):
        super().__init__(alpha)
        self.bias_correction = bias_correction
    
    def predict(self, horizon: int = 1) -> np.ndarray:
        """Generate bias-corrected forecasts."""
        base_forecast = super().predict(horizon)
        
        if self.bias_correction:
            # Apply SBA correction
            correction_factor = 1 - (self.alpha / 2)
            return base_forecast * correction_factor
        
        return base_forecast


def optimize_alpha(demand: np.ndarray, alpha_range: np.ndarray = None) -> float:
    """
    Find optimal alpha using time series cross-validation.
    
    Args:
        demand: Historical demand series
        alpha_range: Array of alpha values to test
        
    Returns:
        Optimal alpha value
    """
    if alpha_range is None:
        alpha_range = np.arange(0.05, 0.5, 0.05)
    
    tscv = TimeSeriesSplit(n_splits=3)
    best_alpha = alpha_range[0]
    best_mae = float('inf')
    
    for alpha in alpha_range:
        maes = []
        
        for train_idx, test_idx in tscv.split(demand):
            train, test = demand[train_idx], demand[test_idx]
            
            try:
                model = EnhancedCrostonsMethod(alpha=alpha)
                model.fit(train)
                predictions = model.predict(horizon=len(test))
                
                # Calculate MAE only on non-zero actual demand
                nonzero_mask = test > 0
                if nonzero_mask.any():
                    mae = mean_absolute_error(test[nonzero_mask], predictions[nonzero_mask])
                    maes.append(mae)
            except ValueError:
                continue
        
        if maes:
            avg_mae = np.mean(maes)
            if avg_mae < best_mae:
                best_mae = avg_mae
                best_alpha = alpha
    
    return best_alpha


# Example: Optimize and fit
optimal_alpha = optimize_alpha(demand_series)
print(f"Optimal alpha: {optimal_alpha:.3f}")

optimized_model = EnhancedCrostonsMethod(alpha=optimal_alpha, bias_correction=True)
optimized_model.fit(demand_series)
print(f"SBA-corrected forecast: {optimized_model.predict(1)[0]:.3f}")

Practical Application with Real Data

Let’s apply Croston’s method to a realistic scenario: forecasting spare parts demand for maintenance operations.

import pandas as pd
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Simulate realistic spare parts demand
np.random.seed(42)
periods = 100
demand_probability = 0.15  # 15% chance of demand each period
demand_size_mean = 3

demand_data = np.where(
    np.random.random(periods) < demand_probability,
    np.random.poisson(demand_size_mean, periods),
    0
)

# Split into train/test
train_size = int(0.8 * len(demand_data))
train_demand = demand_data[:train_size]
test_demand = demand_data[train_size:]

# Fit model with optimized parameters
optimal_alpha = optimize_alpha(train_demand)
model = EnhancedCrostonsMethod(alpha=optimal_alpha, bias_correction=True)
model.fit(train_demand)

# Generate forecasts
test_forecasts = model.predict(horizon=len(test_demand))

# Evaluate accuracy
# For intermittent demand, focus on non-zero periods
nonzero_mask = test_demand > 0
if nonzero_mask.any():
    mae = mean_absolute_error(test_demand[nonzero_mask], test_forecasts[nonzero_mask])
    rmse = np.sqrt(mean_squared_error(test_demand[nonzero_mask], test_forecasts[nonzero_mask]))
    
    print(f"Test Set Performance:")
    print(f"  MAE (non-zero periods): {mae:.3f}")
    print(f"  RMSE (non-zero periods): {rmse:.3f}")
    print(f"  Forecast value: {test_forecasts[0]:.3f}")
    print(f"  Actual demand events: {nonzero_mask.sum()}/{len(test_demand)}")

Comparison with Alternative Methods

Croston’s method isn’t the only option for intermittent demand. The TSB (Teunter-Syntetos-Babai) method and simple moving averages offer alternatives.

def simple_moving_average(demand: np.ndarray, window: int = 5) -> float:
    """Calculate simple moving average, ignoring structure."""
    return np.mean(demand[-window:]) if len(demand) >= window else np.mean(demand)


def compare_methods(demand: np.ndarray, train_size: int = None):
    """Compare forecasting methods on intermittent demand."""
    if train_size is None:
        train_size = int(0.8 * len(demand))
    
    train = demand[:train_size]
    test = demand[train_size:]
    
    # Croston's method
    croston_alpha = optimize_alpha(train)
    croston = EnhancedCrostonsMethod(alpha=croston_alpha)
    croston.fit(train)
    croston_forecast = croston.predict(len(test))
    
    # Simple moving average
    sma_forecast = np.full(len(test), simple_moving_average(train, window=10))
    
    # Evaluate on non-zero periods
    nonzero_mask = test > 0
    if nonzero_mask.any():
        results = {
            'Croston (SBA)': {
                'MAE': mean_absolute_error(test[nonzero_mask], croston_forecast[nonzero_mask]),
                'RMSE': np.sqrt(mean_squared_error(test[nonzero_mask], croston_forecast[nonzero_mask]))
            },
            'Moving Average': {
                'MAE': mean_absolute_error(test[nonzero_mask], sma_forecast[nonzero_mask]),
                'RMSE': np.sqrt(mean_squared_error(test[nonzero_mask], sma_forecast[nonzero_mask]))
            }
        }
        
        return pd.DataFrame(results).T


# Run comparison
comparison_results = compare_methods(demand_data)
print("\nMethod Comparison:")
print(comparison_results)

Conclusion and Best Practices

Croston’s method excels when demand is genuinely intermittent—typically when more than 50% of periods have zero demand and coefficient of variation exceeds 1.0. For less sparse data, traditional exponential smoothing often performs better.

Key recommendations for production deployment:

Parameter Selection: Start with alpha between 0.1 and 0.2. Use cross-validation for fine-tuning, but avoid overfitting to recent patterns in sparse data.

Bias Correction: Always enable SBA correction unless you have specific reasons not to. The bias reduction typically improves inventory optimization decisions.

Monitoring: Track forecast accuracy separately for zero and non-zero periods. Croston’s method should excel at predicting demand size when it occurs, not just overall averages.

Integration: Combine Croston forecasts with safety stock calculations that account for intermittency. Standard deviation-based safety stock formulas don’t work well with intermittent demand—use quantile-based approaches instead.

When Not to Use: If demand occurs in most periods (>80%) or follows clear seasonal patterns, stick with traditional methods like Holt-Winters or SARIMA. Croston’s simplicity becomes a limitation when richer temporal structure exists.

The implementation provided here handles the core use case efficiently. For large-scale applications with thousands of SKUs, vectorize the calculations or use existing libraries like statsforecast, which includes optimized Croston implementations.