Earthformer: Exploring Space-Time Transformers for Earth System Forecasting
Zhihan Gao, Xingjian Shi, Hao Wang, Yi Zhu, Yuyang Wang, Mu Li, Dit-Yan Yeung

TL;DR
Earthformer introduces a novel space-time Transformer architecture with cuboid attention for improved Earth system forecasting, demonstrating state-of-the-art results on real-world benchmarks like precipitation nowcasting and ENSO prediction.
Contribution
The paper presents Earthformer, a flexible and efficient space-time Transformer with cuboid attention, specifically designed for Earth system forecasting tasks.
Findings
Earthformer outperforms existing models on precipitation nowcasting.
It achieves state-of-the-art results on ENSO forecasting.
The cuboid attention mechanism effectively captures spatiotemporal dependencies.
Abstract
Conventionally, Earth system (e.g., weather and climate) forecasting relies on numerical simulation with complex physical models and is hence both expensive in computation and demanding on domain expertise. With the explosive growth of the spatiotemporal Earth observation data in the past decade, data-driven models that apply Deep Learning (DL) are demonstrating impressive potential for various Earth system forecasting tasks. The Transformer, as an emerging DL architecture, despite its broad success in other domains, has limited adoption in this area. In this paper, we propose Earthformer, a space-time Transformer for Earth system forecasting. Earthformer is based on a generic, flexible and efficient space-time attention block, named Cuboid Attention. The idea is to decompose the data into cuboids and apply cuboid-level self-attention in parallel. These cuboids are further connected…
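The decompose-then-attend idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single-sample tensor of shape (T, H, W, C), non-overlapping cuboids whose sizes divide each axis evenly, and single-head attention with identity query/key/value projections. The function names (`split_into_cuboids`, `merge_cuboids`, `cuboid_self_attention`) are hypothetical, and the mechanism that connects cuboids to each other is left out because the abstract above is truncated before describing it.

```python
import numpy as np

def split_into_cuboids(x, size):
    """Reshape (T, H, W, C) into (num_cuboids, tokens_per_cuboid, C)."""
    T, H, W, C = x.shape
    bT, bH, bW = size
    # Assumption: each axis is evenly divisible by the cuboid size.
    assert T % bT == 0 and H % bH == 0 and W % bW == 0
    x = x.reshape(T // bT, bT, H // bH, bH, W // bW, bW, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)  # cuboid-index axes first
    return x.reshape(-1, bT * bH * bW, C)

def merge_cuboids(cuboids, shape, size):
    """Inverse of split_into_cuboids: back to (T, H, W, C)."""
    T, H, W, C = shape
    bT, bH, bW = size
    x = cuboids.reshape(T // bT, H // bH, W // bW, bT, bH, bW, C)
    x = x.transpose(0, 3, 1, 4, 2, 5, 6)
    return x.reshape(T, H, W, C)

def cuboid_self_attention(x, size):
    """Softmax self-attention applied independently inside each cuboid."""
    cuboids = split_into_cuboids(x, size)
    d = cuboids.shape[-1]
    # Batched over the cuboid axis, so all cuboids are processed in parallel.
    scores = cuboids @ cuboids.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    out = weights @ cuboids
    return merge_cuboids(out, x.shape, size)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(4, 8, 8, 3))  # toy (T, H, W, C) sequence
    out = cuboid_self_attention(frames, size=(2, 4, 4))
    print(out.shape)
```

Because attention is restricted to tokens within the same cuboid, the attention matrix per cuboid has side length bT·bH·bW rather than T·H·W, which is where the efficiency gain over full space-time attention would come from in this sketch.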
Taxonomy
Topics: Computational Physics and Python Applications · Meteorological Phenomena and Simulations
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Absolute Position Encodings · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Adam
