When Attention Beats Fourier: Multi-Scale Transformers for PDE Solving on Irregular Domains
Brandon Yee, Pairie Koh, Jack Rodriguez, Mihir Tekal

TL;DR
This paper introduces the Multi-Scale Attention Transformer (MSAT) for solving PDEs, demonstrating superior accuracy and efficiency over existing methods on complex geometries and providing insights into architecture selection and regularization effects.
Contribution
The paper presents MSAT, a novel transformer-based architecture for PDE solving, with comprehensive empirical evaluation and theoretical analysis guiding architecture choice and regularization.
Findings
MSAT achieves state-of-the-art accuracy on complex geometry PDE problems.
MSAT significantly reduces inference time compared to Mamba-NO.
Physics regularization improves performance on diffusion problems but harms chaotic regimes.
Abstract
We study the problem of architecture selection for deep learning models trained to solve partial differential equations (PDEs), asking when transformer-based architectures with learned attention outperform Fourier-domain neural operators. We introduce the Multi-Scale Attention Transformer (MSAT), a deep learning architecture that encodes spatiotemporal solution histories as token sequences and trains end-to-end via a composite supervised objective with optional physics-informed regularization terms. We conduct a comprehensive empirical evaluation against nine baselines, including physics-informed neural networks (PINNs), neural operators (FNO, DeepONet, GNOT), and state-space models (Mamba-NO), across five benchmark problems from the PINNacle suite, using identical train/test splits and reference data for all methods. MSAT achieves state-of-the-art…
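The abstract mentions a composite supervised objective with optional physics-informed regularization but does not give its form in this excerpt. As a rough illustration only, the sketch below combines a data-fit term with an optional PDE-residual penalty for a 1D heat equation u_t = nu * u_xx. The function names, the finite-difference residual, the choice of PDE, and the `physics_weight` parameter are all assumptions made for this example, not the paper's formulation.

```python
import numpy as np


def heat_residual(u, dt, dx, nu):
    """Finite-difference residual of u_t - nu * u_xx on interior points.

    u has shape (T, X): rows are time steps, columns are grid points.
    (Hypothetical helper; the paper's residual is not given in this excerpt.)
    """
    # Forward difference in time, evaluated at interior spatial points.
    u_t = (u[1:, 1:-1] - u[:-1, 1:-1]) / dt
    # Central second difference in space at the earlier time level.
    u_xx = (u[:-1, 2:] - 2.0 * u[:-1, 1:-1] + u[:-1, :-2]) / dx**2
    return u_t - nu * u_xx


def composite_loss(pred, target, dt, dx, nu, physics_weight=0.0):
    """Supervised MSE plus an optional physics penalty (assumed form)."""
    data_term = np.mean((pred - target) ** 2)
    if physics_weight == 0.0:
        # Regularization disabled: purely supervised objective.
        return data_term
    physics_term = np.mean(heat_residual(pred, dt, dx, nu) ** 2)
    return data_term + physics_weight * physics_term


# Usage: a time-independent quadratic profile has u_t = 0 and u_xx = 2,
# so its residual is -2 * nu everywhere on the interior.
x = np.linspace(0.0, 1.0, 11)
u = np.tile(x**2, (4, 1))
loss_plain = composite_loss(u, u, dt=0.01, dx=0.1, nu=0.5)
loss_reg = composite_loss(u, u, dt=0.01, dx=0.1, nu=0.5, physics_weight=0.1)
```

Setting `physics_weight` to zero recovers a purely supervised loss, which is one simple way to make the physics term optional per problem, in keeping with the finding that regularization helps on diffusion problems but harms chaotic regimes.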
