On Neural Scaling Laws for Weather Emulation through Continual Training
Shashank Subramanian, Alexander Kiefer, Arnur Nigmetov, Amir Gholami, Dmitriy Morozov, Michael W. Mahoney

TL;DR
This paper studies neural scaling laws for weather forecasting models using a minimal Swin Transformer architecture. It finds predictable scaling trends and identifies compute-optimal training regimes.
Contribution
The paper shows that minimal models trained continually, with constant learning rates and periodic cooldowns, follow predictable scaling trends and can outperform models trained with standard cosine learning rate schedules on weather emulation tasks.
Findings
Models follow predictable scaling trends.
Cooldown phases improve forecast accuracy and spectral sharpness.
The study identifies compute-optimal training regimes and performance limits.
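As a rough illustration of what a "predictable scaling trend" means in practice, the sketch below fits a power law of the form loss ≈ a · compute^(-b) by ordinary least squares in log-log space. The functional form, the function name `fit_power_law`, and the synthetic data points are all assumptions made for illustration; they are not taken from the paper, which may use a different parameterization or fitting procedure.

```python
import math

def fit_power_law(xs, ys):
    """Fit y ~ a * x**(-b) by least squares on (log x, log y).

    Hypothetical helper for illustration; returns (a, b).
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope and intercept of the best-fit line in log-log space.
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic (made-up) compute budgets and losses that follow an exact power law.
compute = [1e15, 1e16, 1e17, 1e18]
loss = [2.0 * c ** -0.1 for c in compute]

a, b = fit_power_law(compute, loss)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real measurements the points would scatter around the line, and the fitted exponent `b` is what one would extrapolate to estimate the loss at a larger compute budget.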
Abstract
Neural scaling laws, which in some domains can predict the performance of large neural networks as a function of model, data, and compute scale, are the cornerstone of building foundation models in Natural Language Processing and Computer Vision. We study neural scaling in Scientific Machine Learning, focusing on models for weather forecasting. To analyze scaling behavior in as simple a setting as possible, we adopt a minimal, scalable, general-purpose Swin Transformer architecture, and we use continual training with constant learning rates and periodic cooldowns as an efficient training strategy. We show that models trained in this minimalist way follow predictable scaling trends and even outperform standard cosine learning rate schedules. Cooldown phases can be re-purposed to improve downstream performance, e.g., enabling accurate multi-step rollouts over longer forecast horizons as…
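The training strategy mentioned in the abstract, a constant learning rate followed by a cooldown, can be sketched next to a standard cosine schedule as follows. This is a minimal sketch under stated assumptions: the linear decay to zero, the function names, and all numeric values are hypothetical, since the abstract does not specify the cooldown shape or hyperparameters.

```python
import math

def constant_with_cooldown(step, base_lr, cooldown_start, cooldown_steps):
    """Constant learning rate, then a cooldown beginning at `cooldown_start`.

    Assumption: the cooldown decays linearly to zero over `cooldown_steps`.
    """
    if step < cooldown_start:
        return base_lr
    progress = min(1.0, (step - cooldown_start) / cooldown_steps)
    return base_lr * (1.0 - progress)

def cosine_schedule(step, base_lr, total_steps):
    """Standard cosine decay from `base_lr` to zero over `total_steps`."""
    progress = min(1.0, step / total_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Hypothetical settings: 1000 total steps, with the last 200 used for cooldown.
base_lr, total_steps, cooldown_steps = 1e-3, 1000, 200
for step in (0, 500, 800, 900, 1000):
    lr_const = constant_with_cooldown(step, base_lr, total_steps - cooldown_steps, cooldown_steps)
    lr_cos = cosine_schedule(step, base_lr, total_steps)
    print(step, lr_const, lr_cos)
```

One practical difference is that the cosine schedule depends on `total_steps` from the first step, while the constant phase does not. A cooldown can therefore be started from any checkpoint of the constant phase, which is what allows training to continue without fixing the total budget in advance.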
Taxonomy
Topics: Meteorological Phenomena and Simulations · Neural Networks and Reservoir Computing · Explainable Artificial Intelligence (XAI)
