TL;DR
The paper introduces BSA-TNP, a scalable neural process architecture for complex spatiotemporal data that is translation-invariant, supports high-dimensional settings, and aims to be both accurate and efficient.
Contribution
The paper proposes BSA-TNP, an architecture that combines Kernel Regression Blocks (KRBlocks), group-invariant attention biases, and memory-efficient attention for scalable, accurate spatiotemporal inference.
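To make the idea of a group-invariant attention bias concrete, here is a minimal NumPy sketch. It is not the paper's implementation. It assumes the bias is a squared-distance (RBF-style) function of the difference between query and key locations, so shifting every location by the same vector leaves the output unchanged. The function name `biased_attention`, the `lengthscale` parameter, and all shapes are hypothetical choices made for illustration.

```python
import numpy as np

def biased_attention(q, k, v, x_q, x_k, lengthscale=1.0):
    """Softmax attention with an additive bias that depends only on x_q - x_k.

    Illustrative assumption: the bias is an RBF-style log-kernel. The paper's
    actual bias functions may differ.
    """
    logits = q @ k.T / np.sqrt(q.shape[-1])               # (n_q, n_k) content term
    diff = x_q[:, None, :] - x_k[None, :, :]              # (n_q, n_k, d_x) pairwise offsets
    bias = -np.sum(diff ** 2, axis=-1) / (2.0 * lengthscale ** 2)
    logits = logits + bias
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n_q, n_k, d_model, d_x = 4, 6, 8, 2
q = rng.normal(size=(n_q, d_model))   # query features (carry no absolute position)
k = rng.normal(size=(n_k, d_model))   # key features
v = rng.normal(size=(n_k, d_model))   # value features
x_q = rng.normal(size=(n_q, d_x))     # query locations
x_k = rng.normal(size=(n_k, d_x))     # key locations

shift = np.array([3.0, -1.5])         # translate every location by the same vector
out = biased_attention(q, k, v, x_q, x_k)
out_shifted = biased_attention(q, k, v, x_q + shift, x_k + shift)
print(np.allclose(out, out_shifted))
```

The invariance holds here only because locations enter through pairwise differences and the features `q`, `k`, `v` do not encode absolute position. If the content logits were all zero, the attention weights would reduce to normalized Gaussian kernel weights, which is the classical Nadaraya-Watson form of kernel regression.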
Findings
BSA-TNP matches or exceeds existing models' accuracy.
BSA-TNP trains faster than comparable models.
BSA-TNP scales to over 1 million test points in under a minute.
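Scaling attention to very large point sets generally requires never materializing the full query-by-key score matrix. The sketch below shows one well-known way to do that: scanning over key chunks while maintaining a running softmax. It is a generic illustration under stated assumptions, not the paper's attention kernel. The names `chunked_biased_attention` and `dense_biased_attention`, the chunk size, and the squared-distance bias are all hypothetical.

```python
import numpy as np

def rbf_bias(x_q, x_k):
    # Assumed illustrative bias: negative half squared distance between locations.
    diff = x_q[:, None, :] - x_k[None, :, :]
    return -0.5 * np.sum(diff ** 2, axis=-1)

def dense_biased_attention(q, k, v, x_q, x_k):
    # Reference implementation that builds the full (n_q, n_k) score matrix.
    logits = q @ k.T / np.sqrt(q.shape[-1]) + rbf_bias(x_q, x_k)
    logits = logits - logits.max(axis=-1, keepdims=True)
    w = np.exp(logits)
    return (w / w.sum(axis=-1, keepdims=True)) @ v

def chunked_biased_attention(q, k, v, x_q, x_k, chunk=128):
    # Scans over key chunks, so peak memory is O(n_q * chunk), not O(n_q * n_k).
    n_q, d = q.shape
    running_max = np.full((n_q, 1), -np.inf)  # largest logit seen so far per query
    running_sum = np.zeros((n_q, 1))          # softmax denominator, rescaled as we go
    acc = np.zeros((n_q, v.shape[-1]))        # unnormalized weighted sum of values
    for start in range(0, k.shape[0], chunk):
        sl = slice(start, start + chunk)
        logits = q @ k[sl].T / np.sqrt(d) + rbf_bias(x_q, x_k[sl])
        new_max = np.maximum(running_max, logits.max(axis=-1, keepdims=True))
        rescale = np.exp(running_max - new_max)  # corrects earlier partial results
        p = np.exp(logits - new_max)
        running_sum = running_sum * rescale + p.sum(axis=-1, keepdims=True)
        acc = acc * rescale + p @ v[sl]
        running_max = new_max
    return acc / running_sum

rng = np.random.default_rng(1)
n_q, n_k, d_model, d_x = 5, 1000, 16, 2
q = rng.normal(size=(n_q, d_model))
k = rng.normal(size=(n_k, d_model))
v = rng.normal(size=(n_k, d_model))
x_q = rng.normal(size=(n_q, d_x))
x_k = rng.normal(size=(n_k, d_x))

out_dense = dense_biased_attention(q, k, v, x_q, x_k)
out_chunked = chunked_biased_attention(q, k, v, x_q, x_k, chunk=128)
print(np.allclose(out_dense, out_chunked))
```

Because each query's output depends only on the keys and values, queries can also be processed in independent batches, which is one reason this style of computation extends to very large numbers of test points.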
Abstract
Neural Processes (NPs) are a rapidly evolving class of models designed to directly model the posterior predictive distribution of stochastic processes. While early architectures were developed primarily as a scalable alternative to Gaussian Processes (GPs), modern NPs tackle far more complex and data-hungry applications spanning geology, epidemiology, climate, and robotics. These applications have placed increasing pressure on the scalability of these models, with many architectures compromising accuracy for scalability. In this paper, we demonstrate that this trade-off is often unnecessary, particularly when modeling fully or partially translation-invariant processes. We propose a versatile new architecture, the Biased Scan Attention Transformer Neural Process (BSA-TNP), which introduces Kernel Regression Blocks (KRBlocks), group-invariant attention biases, and memory-efficient Biased…
