Global Attention with Linear Complexity for Exascale Generative Data Assimilation in Earth System Prediction
Xiao Wang, Zezhong Zhang, Isaac Lyngaas, Hong-Jun Yoon, Jong-Youl Choi, Siming Liang, Janet Wang, Hristo G. Chipilski, Ashwin M. Aji, Feng Bao, Peter Jan van Leeuwen, Dan Lu, Guannan Zhang

TL;DR
This paper presents a scalable, GPU-efficient generative data assimilation framework using a novel linear-complexity global attention transformer, enabling km-scale Earth system modeling at exascale performance.
Contribution
The paper introduces STORM, a spatiotemporal transformer whose global attention scales linearly with the number of tokens, and demonstrates its application to large-scale Earth system prediction on the Frontier supercomputer.
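This summary does not describe STORM's attention algorithm, so the sketch below is not the authors' method. It only illustrates, with a well-known kernelized formulation of linear attention, how global attention can avoid the n-by-n score matrix: the key-value product is computed first, so cost grows linearly with the token count n. The feature map (elu(x) + 1), the function names, and the tensor shapes are all assumptions chosen for illustration.

```python
import numpy as np

def feature_map(x):
    # Positive feature map elu(x) + 1 (an assumed choice, not taken from the paper).
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    """Global attention in O(n * d * d_v) time for q, k of shape (n, d), v of shape (n, d_v)."""
    qf, kf = feature_map(q), feature_map(k)
    kv = kf.T @ v              # (d, d_v) summary of all keys and values
    z = kf.sum(axis=0)         # (d,) normalizer accumulated over all keys
    return (qf @ kv) / (qf @ z)[:, None]

def quadratic_reference(q, k, v):
    """Same result computed the O(n^2) way, by materializing the (n, n) weight matrix."""
    qf, kf = feature_map(q), feature_map(k)
    w = qf @ kf.T              # (n, n) pairwise weights
    return (w / w.sum(axis=1, keepdims=True)) @ v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = rng.standard_normal((3, 64, 8))
    # Both orderings of the matrix products give the same output.
    print(np.allclose(linear_attention(q, k, v), quadratic_reference(q, k, v)))
```

Every token still attends to every other token. Only the order of the matrix products changes, which is why the two functions agree numerically.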
Findings
Achieves 63% strong scaling efficiency on 32,768 GPUs of the Frontier supercomputer.
Reaches 1.6 ExaFLOP sustained performance.
Scales to 20 billion spatiotemporal tokens for km-scale modeling.
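For readers unfamiliar with the metric, strong scaling efficiency compares the measured runtime on more GPUs against the ideal runtime for the same fixed problem size. The sketch below shows the standard definition. The baseline GPU count and the timings in the usage example are hypothetical, since the summary reports only the 63% figure and the 32,768-GPU count.

```python
def strong_scaling_efficiency(base_gpus, base_time, gpus, time):
    """Ratio of ideal to measured runtime when scaling a fixed-size problem.

    Under ideal strong scaling, runtime shrinks in proportion to the GPU count.
    """
    ideal_time = base_time * base_gpus / gpus
    return ideal_time / time

if __name__ == "__main__":
    # Hypothetical numbers: a 1,024-GPU baseline taking 100 s per step.
    # Ideal time on 32,768 GPUs would be 100 * 1024 / 32768 = 3.125 s.
    measured = 3.125 / 0.63  # a runtime that corresponds to 63% efficiency
    print(strong_scaling_efficiency(1024, 100.0, 32768, measured))
```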
Abstract
Accurate weather and climate prediction relies on data assimilation (DA), which estimates the Earth system state by integrating observations with models. While exascale computing has significantly advanced Earth simulation, scalable and accurate inference of the Earth system state remains a fundamental bottleneck, limiting uncertainty quantification and prediction of extreme events. We introduce a unified one-stage generative DA framework that reformulates assimilation as Bayesian posterior sampling, replacing the conventional forecast-update cycle with compute-dense, GPU-efficient inference. At the core is STORM, a novel spatiotemporal transformer with a global attention linear-complexity scaling algorithm that breaks the quadratic attention barrier. On 32,768 GPUs of the Frontier supercomputer, our method achieves 63% strong scaling efficiency and 1.6 ExaFLOP sustained performance. We…
