The Recipe Matters More Than the Kitchen: Mathematical Foundations of the AI Weather Prediction Pipeline
Piyush Garg, Diana R. Gergel, Andrew E. Shao, and Galen J. Yacalis

TL;DR
This paper develops a comprehensive mathematical framework for AI weather prediction, emphasizing the importance of training methodology, loss functions, and data diversity over architecture alone, supported by empirical validation across diverse models.
Contribution
It introduces a unified theoretical framework for the entire AI weather prediction pipeline and provides empirical validation of its predictions across multiple models.
Findings
MSE-trained models lose spectral energy at high wavenumbers
Forecast errors are largely shared across different architectures
Models systematically underestimate extreme events, with bias increasing linearly during record-breaking scenarios
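The first finding can be illustrated with a toy example. The sketch below is not the paper's derivation or data; it relies only on the standard fact that the forecast minimising mean squared error is the mean over plausible outcomes. It assumes a hypothetical 1-D periodic field with a k^-3 power spectrum whose phases are predictable below a cutoff wavenumber and random above it. The function names, the cutoff, and the ensemble size are all illustrative choices.

```python
import numpy as np


def power_spectrum(field):
    """Power at each wavenumber of a 1-D periodic field."""
    return np.abs(np.fft.rfft(field)) ** 2


def plausible_futures(n_members, n_points, predictable_below, rng):
    """Toy set of equally plausible outcomes (an assumption, not real data).

    Every member has the same k^-3 power spectrum. Phases are shared for
    wavenumbers below `predictable_below` and drawn at random above it.
    """
    k = np.arange(n_points // 2 + 1, dtype=float)
    amplitude = np.zeros_like(k)
    amplitude[1:] = k[1:] ** -1.5  # amplitude k^-1.5 gives power k^-3
    shared_phase = rng.uniform(0.0, 2.0 * np.pi, size=k.size)
    members = np.empty((n_members, n_points))
    for m in range(n_members):
        phase = shared_phase.copy()
        phase[predictable_below:] = rng.uniform(
            0.0, 2.0 * np.pi, size=k.size - predictable_below
        )
        members[m] = np.fft.irfft(amplitude * np.exp(1j * phase), n=n_points)
    return members


def retained_power(members, k_lo, k_hi):
    """Fraction of a typical member's power in [k_lo, k_hi) kept by the mean."""
    # The ensemble mean minimises average squared error against the members.
    mean_power = power_spectrum(members.mean(axis=0))[k_lo:k_hi].sum()
    member_power = np.mean([power_spectrum(x)[k_lo:k_hi].sum() for x in members])
    return mean_power / member_power


rng = np.random.default_rng(0)
futures = plausible_futures(n_members=200, n_points=256, predictable_below=8, rng=rng)
low_k = retained_power(futures, 1, 8)     # predictable scales
high_k = retained_power(futures, 8, 129)  # unpredictable scales
print(f"power retained at low wavenumbers:  {low_k:.3f}")
print(f"power retained at high wavenumbers: {high_k:.3f}")
```

In this toy setting the MSE-optimal field keeps essentially all of its power at the predictable scales and only a small fraction at the unpredictable ones, which is the kind of smoothing the finding describes. How closely this mirrors the behaviour of real trained models is something the sketch does not establish.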
Abstract
AI weather prediction has advanced rapidly, yet no unified mathematical framework explains what determines forecast skill. Existing theory addresses specific architectural choices rather than the learning pipeline as a whole, while operational evidence from 2023-2026 demonstrates that training methodology, loss function design, and data diversity matter at least as much as architecture selection. This paper makes two interleaved contributions. Theoretically, we construct a framework rooted in approximation theory on the sphere, dynamical systems theory, information theory, and statistical learning theory that treats the complete learning pipeline (architecture, loss function, training strategy, data distribution) rather than architecture alone. We establish a Learning Pipeline Error Decomposition showing that estimation error (loss- and data-dependent) dominates approximation error…
