ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling
Xiao Wang, Jong-Youl Choi, Takuya Kurihaya, Isaac Lyngaas, Hong-Jun Yoon, Xi Xiao, David Pugmire, Ming Fan, Nasik M. Nafi, Aristeidis Tsaris, Ashwin M. Aji, Maliha Hossain, Mohamed Wahib, Dali Wang, Peter Thornton, Prasanna Balaprakash, Moetasim Ashfaq, Dan Lu

TL;DR
ORBIT-2 is a scalable foundation model for global, high-resolution climate downscaling. It pairs a lightweight residual Vision Transformer (Reslim) with a tile-wise attention algorithm (TILES) whose cost grows linearly rather than quadratically with sequence length, addressing the generalization and scaling limits of existing AI methods.
Contribution
The paper introduces ORBIT-2, a scalable climate downscaling model that combines a lightweight architecture (Reslim) with a linear-complexity self-attention algorithm (TILES), enabling very long sequences to be processed at exascale.
Findings
Achieves up to 4.1 exaFLOPS throughput on 65,536 GPUs.
Supports 0.9 km global resolution downscaling.
Attains $R^2$ scores of 0.98--0.99 on benchmark data.
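The figures above can be put in context with a little arithmetic. The sketch below assumes that $R^2$ means the standard coefficient of determination, $R^2 = 1 - SS_{res}/SS_{tot}$ (the summary does not state which definition the paper uses), and that the peak throughput is spread evenly over all 65,536 GPUs. The function names `r2_score` and `per_gpu_teraflops` are illustrative and not taken from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def per_gpu_teraflops(total_exaflops, num_gpus):
    """Average throughput per GPU in teraFLOPS, assuming an even split."""
    return total_exaflops * 1e18 / num_gpus / 1e12


# Reported peak: up to 4.1 exaFLOPS on 65,536 GPUs.
avg_tflops = per_gpu_teraflops(4.1, 65_536)
print(f"about {avg_tflops:.1f} teraFLOPS per GPU on average")
```

Under the even-split assumption this works out to a little over 62 teraFLOPS per GPU. An $R^2$ of 0.98 to 0.99 means the residual error is 1 to 2 percent of the variance of the reference data.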
Abstract
Sparse observations and coarse-resolution climate models limit effective regional decision-making, underscoring the need for robust downscaling. However, existing AI methods struggle with generalization across variables and geographies and are constrained by the quadratic complexity of Vision Transformer (ViT) self-attention. We introduce ORBIT-2, a scalable foundation model for global, hyper-resolution climate downscaling. ORBIT-2 incorporates two key innovations: (1) Residual Slim ViT (Reslim), a lightweight architecture with residual learning and Bayesian regularization for efficient, robust prediction; and (2) TILES, a tile-wise sequence scaling algorithm that reduces self-attention complexity from quadratic to linear, enabling long-sequence processing and massive parallelism. ORBIT-2 scales to 10 billion parameters across 65,536 GPUs, achieving up to 4.1 exaFLOPS sustained…
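The quadratic-to-linear claim can be illustrated with a generic tile-local attention sketch. This is not the authors' TILES implementation, whose details are not given in this summary. It only assumes that the sequence is split into fixed-length tiles and that each tile attends to itself; the names `attention`, `tiled_attention`, and `score_count` are hypothetical.

```python
import math


def attention(q, k, v):
    """Plain softmax attention over lists of equal-length float vectors."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[d] for w, vj in zip(weights, v)) / total
                    for d in range(len(v[0]))])
    return out


def tiled_attention(x, tile_len):
    """Assumed tile-local self-attention: each tile attends only to itself."""
    out = []
    for start in range(0, len(x), tile_len):
        tile = x[start:start + tile_len]
        out.extend(attention(tile, tile, tile))
    return out


def score_count(seq_len, tile_len=None):
    """Number of query-key scores computed, with or without tiling."""
    if tile_len is None:
        return seq_len * seq_len
    full, rem = divmod(seq_len, tile_len)
    return full * tile_len * tile_len + rem * rem


# Doubling the sequence quadruples the full cost but only doubles the tiled cost.
for n in (1024, 2048):
    print(n, score_count(n), score_count(n, tile_len=64))
```

With a fixed tile length $t$, a sequence of $N$ tokens needs roughly $N \cdot t$ scores instead of $N^2$, and the tiles are independent of one another, so they can be processed in parallel. A practical scheme would also need some way to pass information across tile boundaries, which this sketch leaves out.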
Taxonomy
Topics: Remote Sensing in Agriculture · Cryospheric studies and observations · Urban Heat Island Mitigation
Methods: Linear Layer · Multi-Head Attention · Dense Connections · Adam · Attention Is All You Need · Dropout · Vision Transformer · Layer Normalization · Position-Wise Feed-Forward Layer · Byte Pair Encoding
