Continuum Attention for Neural Operators

Edoardo Calvello; Nikola B. Kovachki; Matthew E. Levine; Andrew M. Stuart

arXiv:2406.06486 · cs.LG · December 23, 2025 · 5 cites

TL;DR

This paper introduces a novel function space formulation of attention mechanisms, enabling the design of transformer neural operators with universal approximation capabilities for mappings between function spaces.

Contribution

It formulates attention as an operator in infinite-dimensional function spaces and proves a universal approximation theorem for transformer neural operators.

Findings

01 First universal approximation theorem for transformer neural operators.

02 Efficient attention-based architectures for multi-dimensional domains.

03 Numerical results demonstrating effectiveness on operator learning problems.

Abstract

Transformers, and the attention mechanism in particular, have become ubiquitous in machine learning. Their success in modeling nonlocal, long-range correlations has led to their widespread adoption in natural language processing, computer vision, and time series problems. Neural operators, which map spaces of functions into spaces of functions, are necessarily both nonlinear and nonlocal if they are universal; it is thus natural to ask whether the attention mechanism can be used in the design of neural operators. Motivated by this, we study transformers in the function space setting. We formulate attention as a map between infinite dimensional function spaces and prove that the attention mechanism as implemented in practice is a Monte Carlo or finite difference approximation of this operator. The function space formulation allows for the design of transformer neural operators, a class…
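The claim that practical attention approximates a function space operator can be illustrated with a small numerical sketch. It assumes the continuum operator takes the softmax-normalised form A(u)(x) = ∫ exp(Qu(x)·Ku(y)) Vu(y) dy / ∫ exp(Qu(x)·Ku(s)) ds on the domain [0, 1]. That form is an assumption made for illustration and is not quoted from the paper. The function name `attention_at`, the scalar weights `wq`, `wk`, `wv`, and the test function `u` are likewise hypothetical. The sketch evaluates ordinary softmax attention at one query location using two grids of different resolution and one set of uniform random samples.

```python
import math
import random


def attention_at(x, points, u, wq=1.0, wk=1.0, wv=1.0):
    """Softmax attention output at query location x.

    `points` are the key/value locations, `u` is the input function, and
    wq, wk, wv are scalar stand-ins for the learned query, key, and value
    maps (hypothetical, for illustration only).
    """
    q = wq * u(x)
    scores = [q * wk * u(y) for y in points]
    m = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    # Weighted average of values: a ratio of two discretised integrals.
    return sum(w * wv * u(y) for w, y in zip(weights, points)) / total


def u(y):
    # Smooth periodic input function on [0, 1], chosen for illustration.
    return math.sin(2.0 * math.pi * y)


x = 0.25  # query location

# Two midpoint grids of different resolution (quadrature view).
coarse_grid = [(i + 0.5) / 64 for i in range(64)]
fine_grid = [(i + 0.5) / 1024 for i in range(1024)]

# Uniform random samples of the domain (Monte Carlo view), seeded.
rng = random.Random(0)
samples = [rng.random() for _ in range(20000)]

coarse = attention_at(x, coarse_grid, u)
fine = attention_at(x, fine_grid, u)
monte_carlo = attention_at(x, samples, u)

print("coarse grid :", coarse)
print("fine grid   :", fine)
print("Monte Carlo :", monte_carlo)
```

For this particular input and unit weights, the assumed continuum value is the ratio of modified Bessel functions I_1(1)/I_0(1), roughly 0.446. The two grid evaluations should agree closely with each other and with that value, and the sampled evaluation should land nearby. This is the sense in which the same attention weights can be applied across discretisations of the input function.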

Code & Models

Repositories

EdoardoCalvello/TransformerNeuralOperators
PyTorch · Official

Taxonomy

Topics: Neural Networks and Applications

Methods: Activation Patching