Multi-model time-series forecasting with Bayesian Optimisation — SARIMA, Random Forest, XGBoost, LightGBM, Prophet, LSTM, QuantileML, and weighted ensembles.
Repo: github.com/TomCardeLo/boa-sarima-forecaster
The problem
Demand planners managing hundreds of SKUs across multiple markets face the same headache: every series needs different hyperparameters. Manual tuning doesn’t scale, and grid search burns CPU for diminishing returns.
BOA Forecaster solves this with Bayesian Optimisation (Optuna TPE) — a sample-efficient search that learns from past evaluations and focuses on promising regions of the parameter space.
What it does
Given a time series (monthly, weekly, daily, or hourly), the library:
- Cleans the data — fills calendar gaps, removes zero-demand groups, clips outliers with a weighted moving-average smoother.
- Picks a model — SARIMA, Random Forest, XGBoost, LightGBM, Prophet, LSTM, or a weighted ensemble of any combination.
- Tunes it automatically — Optuna explores the search space, scoring each trial with 0.7·sMAPE + 0.3·RMSLE (configurable).
- Validates it — expanding-window walk-forward CV, with baselines (Seasonal Naïve, ETS, AutoARIMA) for honest comparison.
- Forecasts — point predictions, plus optional probabilistic intervals via QuantileMLSpec.
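The trial score above can be sketched in a few lines of plain Python. This is an illustrative reimplementation of the 0.7·sMAPE + 0.3·RMSLE blend, not the library's actual code; the real scorer may normalise the two metrics onto a common scale:

```python
import math

def smape(actual, forecast):
    """Symmetric MAPE in percent; 0 when forecasts are perfect."""
    terms = [
        0.0 if a == f == 0 else 2.0 * abs(f - a) / (abs(a) + abs(f))
        for a, f in zip(actual, forecast)
    ]
    return 100.0 * sum(terms) / len(terms)

def rmsle(actual, forecast):
    """Root mean squared log error; assumes non-negative values."""
    sq = [(math.log1p(f) - math.log1p(a)) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(sq) / len(sq))

def trial_score(actual, forecast, w_smape=0.7, w_rmsle=0.3):
    # Weighted blend matching the configurable 0.7/0.3 default.
    return w_smape * smape(actual, forecast) + w_rmsle * rmsle(actual, forecast)
```

sMAPE keeps the score scale-free across SKUs of very different volumes, while RMSLE penalises under-forecasting a spike more than the symmetric term alone would.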
How it’s built
The core is a small protocol — ModelSpec — that any model implements in roughly 50 lines. That single abstraction is what makes the framework pluggable: SARIMA, tree boosters, and a PyTorch LSTM all share the same optimiser and validation loop.
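A protocol like that might look as follows. The method names here are illustrative guesses, not the library's actual signatures, but they show why one abstraction lets the same optimiser and validation loop drive every model family:

```python
from typing import Any, Dict, Protocol, Sequence

class ModelSpec(Protocol):
    """Hypothetical sketch of the pluggable-model protocol."""

    def suggest_params(self, trial: Any) -> Dict[str, Any]:
        """Map an Optuna trial to a hyperparameter dict."""
        ...

    def fit(self, y: Sequence[float], params: Dict[str, Any]) -> "ModelSpec":
        """Fit on the training portion of the series."""
        ...

    def predict(self, horizon: int) -> Sequence[float]:
        """Return point forecasts for the next `horizon` steps."""
        ...

class NaiveSpec:
    """Toy conforming implementation: repeats the last observation."""

    def suggest_params(self, trial):
        return {}  # nothing to tune

    def fit(self, y, params):
        self._last = y[-1]
        return self

    def predict(self, horizon):
        return [self._last] * horizon
```

Because the optimiser only ever calls these three methods, swapping SARIMA for a tree booster or an LSTM is a matter of registering a different spec.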
data_loader → preprocessor → standardization → features
↓
ModelSpec
↓
optimizer (Optuna TPE)
↓
validation (walk-forward CV)
↓
forecast (+ optional bias correction)
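The walk-forward validation stage in the pipeline above boils down to generating expanding-window splits. A minimal sketch (index-based, with illustrative parameter names):

```python
def expanding_window_splits(n, initial, horizon, step=1):
    """Yield (train_indices, test_indices) pairs for expanding-window
    walk-forward CV: the train set grows each fold, and the test window
    always lies strictly after it — no leakage from the future."""
    train_end = initial
    while train_end + horizon <= n:
        yield list(range(train_end)), list(range(train_end, train_end + horizon))
        train_end += step

for train, test in expanding_window_splits(n=6, initial=3, horizon=2):
    print(train, "->", test)
```

Scoring each fold against Seasonal Naïve, ETS, and AutoARIMA on the same splits is what keeps the comparison honest: a tuned model only counts as a win if it beats the baselines out of sample.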
Everything is configurable through a single config.yaml, and the public API is exported from one entry point so callers don’t need to know which model family they’re using.
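For illustration, a config might be shaped roughly like the fragment below — the field names are hypothetical, not the library's actual schema:

```yaml
# Hypothetical sketch of a config.yaml; keys are illustrative only.
data:
  frequency: monthly
  fill_gaps: true
model:
  family: ensemble
  members: [sarima, lightgbm]
optimizer:
  n_trials: 100
  score: {smape: 0.7, rmsle: 0.3}
validation:
  strategy: walk_forward
  horizon: 12
```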
Things I’m proud of
- Soft-failure optimiser: when an Optuna trial crashes, the result carries a fallback flag instead of raising — production pipelines stay green.
- Vectorised outlier clipping: weighted_moving_stats_series is 18–130× faster than the row-by-row version while remaining mathematically equivalent.
- Bucketed metrics: hit_rate_weighted and f1_by_bucket for tiered accuracy, where landing in the right band matters more than the absolute error (think air-quality indexes).
- Preset packs: presets/air_quality.py ships ICA / US EPA AQI edges out of the box.
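The soft-failure pattern from the first bullet is worth spelling out. This is an illustrative sketch of the idea, not the library's actual code: a crashing trial returns a sentinel score plus a fallback flag instead of propagating the exception:

```python
def run_trial_soft(objective, trial, fallback_score=float("inf")):
    """Run one optimisation trial; on failure, return a flagged fallback
    result instead of raising, so the surrounding pipeline keeps running."""
    try:
        return {"score": objective(trial), "fallback": False, "error": None}
    except Exception as exc:
        return {"score": fallback_score, "fallback": True, "error": repr(exc)}
```

Downstream code can then filter or alert on `fallback=True` results without a single bad SKU taking down a batch run over hundreds of series.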
Stack
Python 3.9+, statsmodels, scikit-learn, XGBoost, LightGBM, Prophet, PyTorch, Optuna, Pydantic. Tested across Python 3.9–3.11 in CI.