Jigsaw: Training Multi-Billion-Parameter AI Weather Models with Optimized Model Parallelism
Deifilia Kieckhefen, Markus Götz, Lars H. Heyen, Achim Streit, Charlotte Debus

TL;DR
This paper introduces WeatherMixer, a neural network architecture for weather prediction, and Jigsaw, a novel parallelization scheme enabling efficient training of multi-billion-parameter models on large GPU clusters, significantly improving scaling and performance.
Contribution
The paper presents WeatherMixer for global weather modeling and Jigsaw for optimized model parallelism, enabling training of extremely large models with high efficiency.
Findings
Achieved 9 and 11 PFLOPs peak performance on 256 GPUs.
Reached 68% and 72% scaling efficiency with Jigsaw.
Outperformed previous methods in strong and weak scaling tests.
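This summary does not say how the scaling-efficiency figures were computed. As a rough guide to reading them, the sketch below uses the conventional definitions of strong and weak scaling efficiency. The function names and the timings in the usage lines are hypothetical and are not taken from the paper.

```python
def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Strong scaling: total problem size is fixed while devices increase.

    Ideal runtime shrinks by the factor n / n_base, so efficiency is the
    measured speedup divided by that ideal speedup.
    """
    speedup = t_base / t_n
    ideal_speedup = n / n_base
    return speedup / ideal_speedup


def weak_scaling_efficiency(t_base, t_n):
    """Weak scaling: problem size grows with the device count.

    Ideal runtime stays constant, so efficiency is baseline time over
    measured time. Values above 1.0 indicate better-than-ideal scaling.
    """
    return t_base / t_n


# Hypothetical timings (seconds per step), for illustration only.
strong = strong_scaling_efficiency(t_base=100.0, n_base=1, t_n=20.0, n=8)
weak = weak_scaling_efficiency(t_base=100.0, t_n=125.0)
print(f"strong: {strong:.3f}, weak: {weak:.3f}")
```

Under these definitions, a reported efficiency of 68% would mean the measured speedup is 0.68 times the ideal one for the device count used.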
Abstract
AI-based methods have revolutionized atmospheric forecasting, with recent successes in medium-range forecasting spurring the development of climate foundation models. Accurate modeling of complex atmospheric dynamics at high spatial resolutions and longer lead times requires large neural networks and gigabyte-sized data samples, making accelerator memory and I/O-bandwidth the bottlenecks for model training. We introduce WeatherMixer, a multi-layer-perceptron-based architecture whose workload scales linearly with input size, allowing the model to learn global weather phenomena at accuracies similar to numerical weather prediction. To cope with the computational demand, we propose Jigsaw, a novel model parallelization scheme that employs both domain and tensor parallelism, eliminating memory redundancy. Jigsaw exceeds state-of-the-art performance in strong scaling in…
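To make the phrase "both domain and tensor parallelism" concrete, here is a minimal single-process NumPy sketch of a linear layer whose input is split along the sample (domain) axis and whose weight matrix is split along the output-feature (tensor) axis. It is a generic illustration, not Jigsaw's actual scheme: the function name, shard counts, and array shapes are assumptions, and the communication pattern and redundancy elimination described in the abstract are not modeled.

```python
import numpy as np


def sharded_linear(x, w, domain_shards, tensor_shards):
    """Compute x @ w block by block, as a grid of workers might.

    x is split along axis 0 (domain parallelism) and w along axis 1
    (tensor parallelism). Each (i, j) block product needs only one input
    shard and one weight shard, so no single worker would need all of x
    or all of w. Here the blocks are computed in a loop on one process.
    """
    x_parts = np.array_split(x, domain_shards, axis=0)
    w_parts = np.array_split(w, tensor_shards, axis=1)
    rows = []
    for x_i in x_parts:
        # Every block in this row shares the same input shard x_i.
        blocks = [x_i @ w_j for w_j in w_parts]
        rows.append(np.concatenate(blocks, axis=1))
    return np.concatenate(rows, axis=0)


# Hypothetical shapes: 8 grid points with 6 features, mapped to 4 outputs.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 6))
w = rng.standard_normal((6, 4))
y = sharded_linear(x, w, domain_shards=2, tensor_shards=2)
print(np.allclose(y, x @ w))
```

In this naive grid each input shard is still needed by every worker in its row, which is the kind of duplication a scheme aiming to eliminate memory redundancy would have to avoid.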
Taxonomy
Topics: Meteorological Phenomena and Simulations · Hydrological Forecasting Using AI · Tropical and Extratropical Cyclones Research
