Transolver is a Linear Transformer: Revisiting Physics-Attention through the Lens of Linear Attention
Wenjie Hu, Sidun Liu, Peng Qiao, Zhenglun Sun, Yong Dou

TL;DR
This paper reinterprets Physics-Attention as a form of linear attention, redesigns it into a more efficient Linear Attention Neural Operator, and demonstrates state-of-the-art results on PDE benchmarks with reduced complexity.
Contribution
It introduces a novel reformulation of Physics-Attention as linear attention and proposes LinearNO, achieving better performance with fewer parameters and lower computational cost.
Findings
State-of-the-art performance on six PDE benchmarks.
40% reduction in model parameters.
36% reduction in computational cost.
Abstract
Recent advances in Transformer-based Neural Operators have enabled significant progress in data-driven solvers for Partial Differential Equations (PDEs). Most current research has focused on reducing the quadratic complexity of attention to address the resulting low training and inference efficiency. Among these works, Transolver stands out as a representative method that introduces Physics-Attention to reduce computational costs. Physics-Attention projects grid points into slices for slice attention, then maps them back through deslicing. However, we observe that Physics-Attention can be reformulated as a special case of linear attention, and that the slice attention may even hurt the model performance. Based on these observations, we argue that its effectiveness primarily arises from the slice and deslice operations rather than interactions between slices. Building on this insight, we…
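The slice, slice attention, and deslice steps described in the abstract can be illustrated with a small NumPy sketch. This is a simplified single-head reading of the description, not the authors' implementation: the function names, the projection matrices (`w_slice`, `wq`, `wk`, `wv`), and the choice of softmax-normalized slice weights are assumptions made for illustration. With the slice attention switched off, the remaining slice and deslice steps take the form Q (K^T V) that linear attention uses, which is the structural observation the abstract makes.

```python
import numpy as np


def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def physics_attention_sketch(x, w_slice, wq, wk, wv, use_slice_attention=True):
    """Hypothetical single-head sketch: slice -> (optional) slice attention -> deslice.

    x:        (N, C) features at N grid points
    w_slice:  (C, M) projection to M slice logits (assumed form)
    wq/wk/wv: (C, C) projections used only by the slice attention
    """
    # Soft assignment of each grid point to M slices; each row sums to 1.
    weights = softmax(x @ w_slice, axis=-1)              # (N, M)

    # Slice: weighted average of point features per slice.
    slice_mass = weights.sum(axis=0)                     # (M,)
    z = (weights.T @ x) / slice_mass[:, None]            # (M, C)

    if use_slice_attention:
        # Standard softmax attention among the M slice tokens, cost O(M^2 C).
        q, k, v = z @ wq, z @ wk, z @ wv
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (M, M)
        z = attn @ v

    # Deslice: broadcast slice tokens back to the grid points.
    return weights @ z                                   # (N, C)


def linear_attention_view(x, w_slice):
    """The same slice/deslice pipeline written as linear attention Q (K^T V)."""
    q = softmax(x @ w_slice, axis=-1)                    # (N, M), acts as queries
    k = q / q.sum(axis=0, keepdims=True)                 # (N, M), column-normalized keys
    return q @ (k.T @ x)                                 # never forms an N x N matrix


# Small usage example with random data (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
n_points, channels, n_slices = 64, 8, 4
x = rng.normal(size=(n_points, channels))
w_slice = rng.normal(size=(channels, n_slices))
wq, wk, wv = (rng.normal(size=(channels, channels)) for _ in range(3))

no_slice_attn = physics_attention_sketch(x, w_slice, wq, wk, wv, use_slice_attention=False)
as_linear_attn = linear_attention_view(x, w_slice)
print(np.allclose(no_slice_attn, as_linear_attn))
```

In this sketch the cost of the slice and deslice steps grows linearly with the number of grid points N, because the N x N attention matrix is never materialized; only the optional M x M slice attention is quadratic, and only in the number of slices.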
Taxonomy
Topics: Model Reduction and Neural Networks · Machine Learning in Materials Science · Ferroelectric and Negative Capacitance Devices
