Simple yet Effective: Low-Rank Spatial Attention for Neural Operators

Zherui Yang; Haiyang Xin; Tao Du; Ligang Liu

arXiv:2604.03582 · cs.LG · April 7, 2026

TL;DR

This paper introduces Low-Rank Spatial Attention (LRSA), a simple module for neural operators, built from standard Transformer components, that efficiently models global interactions in PDEs and reports over 17% error reduction relative to the second-best methods.

Contribution

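The paper unifies representative global mixing modules in neural operators under a shared low-rank template (compress pointwise features into a compact latent space, process global interactions there, reconstruct the context at the spatial points) and presents LRSA, a straightforward, hardware-compatible instantiation built from standard Transformer components.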
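As a rough illustration of what a low-rank template can look like in matrix form, the block below writes the global mixing step as a product of three thin factors. The notation is an assumption introduced here, not taken from the paper: X holds the features of N spatial points with d channels, M is the latent size, and E, P, D stand for the compress, process, and reconstruct maps (which may themselves depend on X, as they do in attention).

```latex
Y = D\,P\,E\,X,
\qquad X \in \mathbb{R}^{N \times d},\;
E \in \mathbb{R}^{M \times N},\;
P \in \mathbb{R}^{M \times M},\;
D \in \mathbb{R}^{N \times M},\;
M \ll N
% The effective N x N mixing matrix D P E has rank at most M.
% Applying the factors right to left costs O(N M d + M^2 d),
% compared with O(N^2 d) for a dense N x N interaction.
```

Under this reading, the choice of M trades accuracy against cost: a kernel with rapid spectral decay loses little when M is small.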

Findings

01. Achieves over 17% error reduction compared to second-best methods.

02. Maintains stability and efficiency in mixed-precision training.

03. Simple construction suffices for high accuracy in neural operators.

Abstract

Neural operators have emerged as data-driven surrogates for solving partial differential equations (PDEs), and their success hinges on efficiently modeling the long-range, global coupling among spatial points induced by the underlying physics. In many PDE regimes, the induced global interaction kernels are empirically compressible, exhibiting rapid spectral decay that admits low-rank approximations. We leverage this observation to unify representative global mixing modules in neural operators under a shared low-rank template: compressing high-dimensional pointwise features into a compact latent space, processing global interactions within it, and reconstructing the global context back to spatial points. Guided by this view, we introduce Low-Rank Spatial Attention (LRSA) as a clean and direct instantiation of this template. Crucially, unlike prior approaches that often rely on…
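The compress, process, reconstruct pattern described in the abstract can be sketched with plain scaled dot-product attention. This is a minimal illustration under stated assumptions, not the authors' LRSA implementation: the function names, the random (rather than learned) latent queries, and the absence of learned projections, multiple heads, and normalization are all simplifications made here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attn_weights(q, k):
    # Scaled dot-product attention weights: q (A, d), k (B, d) -> (A, B).
    return softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)

def low_rank_global_mixing(x, latent_queries):
    """Hypothetical sketch of the low-rank template.

    x:              (N, d) pointwise features
    latent_queries: (M, d) latent tokens with M << N (assumed, random here)
    Returns the mixed features (N, d) and the implied (N, N) mixing matrix.
    """
    a_in = attn_weights(latent_queries, x)   # (M, N) compress weights
    z = a_in @ x                             # compress: N points -> M tokens
    a_lat = attn_weights(z, z)               # (M, M) latent self-attention
    z = a_lat @ z                            # process: mix inside latent space
    a_out = attn_weights(x, z)               # (N, M) reconstruct weights
    y = a_out @ z                            # reconstruct: back to N points
    # Formed only for illustration; building it defeats the O(N M) cost.
    mixing = a_out @ a_lat @ a_in            # (N, N), rank at most M
    return y, mixing

rng = np.random.default_rng(0)
N, M, d = 256, 8, 32
x = rng.standard_normal((N, d))
latent_queries = rng.standard_normal((M, d))
y, mixing = low_rank_global_mixing(x, latent_queries)
print(y.shape)  # (256, 32)
```

Every output point depends on every input point, yet the interaction passes through only M latent tokens, so the implied N x N mixing matrix has rank at most M and is never needed explicitly in practice.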
