Spatially-informed transformers: Injecting geostatistical covariance biases into self-attention for spatio-temporal forecasting
Yuri Calleo

TL;DR
This paper introduces a hybrid transformer architecture that incorporates geostatistical covariance biases into self-attention, improving spatio-temporal forecasting by combining probabilistic rigor with deep learning flexibility.
Contribution
It proposes a spatially-informed transformer that injects geostatistical inductive bias into self-attention via a learnable covariance kernel, bridging geostatistics and deep learning.
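The idea of biasing attention with a covariance kernel can be illustrated with a minimal NumPy sketch. This summary does not give the paper's exact formulation, so everything here is an assumption for illustration: the additive form of the bias, the exponential kernel, and the names `exponential_kernel_bias`, `spatially_biased_attention`, `log_range`, and `log_scale`. In a trained model the two kernel parameters would be learned jointly with the projection weights.

```python
import numpy as np

def pairwise_distances(coords):
    # coords: (n_sensors, n_dims) array of sensor locations.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def exponential_kernel_bias(dist, log_range, log_scale):
    # Assumed kernel: scale * exp(-d / range). The parameters are stored
    # on a log scale so that they remain positive when optimized.
    return np.exp(log_scale) * np.exp(-dist / np.exp(log_range))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatially_biased_attention(x, coords, w_q, w_k, w_v,
                               log_range=0.0, log_scale=0.0):
    # Standard scaled dot-product logits from the token features.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    logits = q @ k.T / np.sqrt(q.shape[-1])
    # Distance-dependent term added to the logits before the softmax
    # (hypothetical additive form, not confirmed by the source).
    bias = exponential_kernel_bias(pairwise_distances(coords),
                                   log_range, log_scale)
    weights = softmax(logits + bias, axis=-1)
    return weights @ v, weights
```

With the query and key projections set to zero, the content-based logits vanish and the attention weights depend only on distance, so nearby sensors receive more weight than distant ones. This is the kind of geometric prior that a plain transformer lacks.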
Findings
Outperforms state-of-the-art graph neural networks on benchmarks.
Recovers true spatial decay parameters end-to-end.
Provides well-calibrated probabilistic forecasts.
Abstract
The modeling of high-dimensional spatio-temporal processes presents a fundamental dichotomy between the probabilistic rigor of classical geostatistics and the flexible, high-capacity representations of deep learning. While Gaussian processes offer theoretical consistency and exact uncertainty quantification, their prohibitive computational scaling renders them impractical for massive sensor networks. Conversely, modern transformer architectures excel at sequence modeling but inherently lack a geometric inductive bias, treating spatial sensors as permutation-invariant tokens without a native understanding of distance. In this work, we propose a spatially-informed transformer, a hybrid architecture that injects a geostatistical inductive bias directly into the self-attention mechanism via a learnable covariance kernel. By formally decomposing the attention structure into a stationary…
Taxonomy
Topics: Soil Geostatistics and Mapping · Gaussian Processes and Bayesian Inference · Traffic Prediction and Management Techniques
