Bayesian Transformer for Probabilistic Load Forecasting in Smart Grids
Sajib Debnath, Md. Uzzal Mia

TL;DR
This paper introduces a Bayesian Transformer model for probabilistic load forecasting in smart grids. It integrates three uncertainty mechanisms (Monte Carlo Dropout, variational feed-forward layers, and stochastic attention) into a PatchTST backbone to produce well-calibrated, sharp prediction intervals, and it reports better probabilistic accuracy than deep ensembles and deterministic models.
Contribution
To the authors' knowledge, it presents the first application of Bayesian attention to probabilistic load forecasting, combining three complementary uncertainty mechanisms within a Transformer architecture.
Findings
Achieves state-of-the-art CRPS scores on benchmark datasets.
Maintains high calibration and sharpness during extreme weather events.
Outperforms deep ensembles and deterministic models in probabilistic accuracy.
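The findings above are stated in terms of CRPS, calibration, and sharpness. As a rough illustration of what those metrics measure, here is a minimal sketch in plain Python. The function names and the toy numbers are hypothetical and are not taken from the paper; the CRPS estimator is the standard sample-based form, E|X - y| - 0.5 E|X - X'|, which may differ from the estimator the authors used.

```python
from typing import Sequence


def crps_from_samples(samples: Sequence[float], y: float) -> float:
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one forecast sample")
    # Average distance between the forecast samples and the observation.
    term_obs = sum(abs(x - y) for x in samples) / n
    # Average pairwise distance between forecast samples (spread of the forecast).
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


def interval_coverage(lower: Sequence[float], upper: Sequence[float],
                      y: Sequence[float]) -> float:
    """Calibration check: fraction of observations inside [lower, upper]."""
    hits = sum(1 for lo, hi, obs in zip(lower, upper, y) if lo <= obs <= hi)
    return hits / len(y)


def mean_interval_width(lower: Sequence[float], upper: Sequence[float]) -> float:
    """Sharpness: average interval width (narrower is sharper)."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)
```

With a single forecast sample the CRPS reduces to the absolute error, which is why it is often read as a probabilistic generalization of MAE. A nominal 90% interval is well calibrated when its empirical coverage is close to 0.90; among calibrated forecasts, the one with the smaller mean width is the sharper one.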
Abstract
The reliable operation of modern power grids requires probabilistic load forecasts with well-calibrated uncertainty estimates. However, existing deep learning models produce overconfident point predictions that fail catastrophically under extreme weather distributional shifts. This study proposes a Bayesian Transformer (BT) framework that integrates three complementary uncertainty mechanisms into a PatchTST backbone: Monte Carlo Dropout for epistemic parameter uncertainty, variational feed-forward layers with log-uniform weight priors, and stochastic attention with learnable Gaussian noise perturbations on pre-softmax logits, representing, to the best of our knowledge, the first application of Bayesian attention to probabilistic load forecasting. A seven-level multi-quantile pinball-loss prediction head and post-training isotonic regression calibration produce sharp, near-nominally…
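The abstract mentions a seven-level multi-quantile pinball-loss prediction head. The following is a minimal sketch of that loss in plain Python, not the authors' implementation. The seven quantile levels in QUANTILES are an assumption, since the abstract does not list them, and the function names are hypothetical.

```python
from typing import Sequence

# Assumed quantile levels: the abstract says "seven-level" but does not list them.
QUANTILES = (0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975)


def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Pinball (quantile) loss for one observation at quantile level q."""
    diff = y_true - y_pred
    # Under-prediction (diff > 0) is weighted by q, over-prediction by (1 - q).
    return max(q * diff, (q - 1.0) * diff)


def multi_quantile_loss(y_true: float, y_preds: Sequence[float],
                        quantiles: Sequence[float] = QUANTILES) -> float:
    """Average pinball loss over all quantile outputs for one observation."""
    if len(y_preds) != len(quantiles):
        raise ValueError("one prediction is required per quantile level")
    total = sum(pinball_loss(y_true, p, q) for p, q in zip(y_preds, quantiles))
    return total / len(quantiles)
```

The asymmetry is what makes each output converge toward its own quantile: at q = 0.9, under-predicting by 2 units costs 1.8 while over-predicting by 2 units costs only 0.2, so the minimizer sits high in the distribution.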
Taxonomy
Topics: Energy Load and Power Forecasting · Optimal Power Flow Distribution · Integrated Energy Systems Optimization
