FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention

Tan M. Nguyen; Vai Suliafu; Stanley J. Osher; Long Chen; Bao Wang

arXiv:2108.02347·cs.LG·August 6, 2021·5 cites

Open Access · 1 Video

TL;DR

FMMformers are a class of transformers inspired by the fast multipole method. They decompose attention into a near-field component and a far-field component, which yields linear complexity in sequence length and improved accuracy on long-range tasks.

Contribution

This work presents FMMformers, efficient transformers that decompose attention into a near-field part (a banded matrix) and a far-field part (a low-rank matrix). The decomposition reduces the cost of computing attention from quadratic to linear in the sequence length.

Findings

01 Achieve linear computational time and memory complexity in sequence length

02 Outperform standard transformers on Long Range Arena benchmarks

03 Improve accuracy significantly in language modeling tasks

Abstract

We propose FMMformers, a class of efficient and flexible transformers inspired by the celebrated fast multipole method (FMM) for accelerating interacting particle simulation. FMM decomposes particle-particle interaction into near-field and far-field components and then performs direct and coarse-grained computation, respectively. Similarly, FMMformers decompose the attention into near-field and far-field attention, modeling the near-field attention by a banded matrix and the far-field attention by a low-rank matrix. Computing the attention matrix for FMMformers requires linear complexity in computational time and memory footprint with respect to the sequence length. In contrast, standard transformers suffer from quadratic complexity. We analyze and validate the advantage of FMMformers over the standard transformer on the Long Range Arena and language modeling benchmarks. FMMformers can…
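The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `bandwidth` parameter, the `elu(x) + 1` feature map used to build the low-rank term, and the fixed blending weights `w_near` and `w_far` are all assumptions made for illustration. The near-field part applies softmax attention only inside a band around the diagonal, and the far-field part uses a kernelized product that never forms the full n × n matrix.

```python
import numpy as np


def near_field_attention(Q, K, V, bandwidth):
    """Banded softmax attention: query i sees only keys with |i - j| <= bandwidth.

    Cost is O(n * bandwidth * d), i.e. linear in the sequence length n.
    """
    n, d = Q.shape
    out = np.zeros(V.shape, dtype=float)
    for i in range(n):
        lo, hi = max(0, i - bandwidth), min(n, i + bandwidth + 1)
        scores = Q[i] @ K[lo:hi].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        out[i] = weights @ V[lo:hi]
    return out


def feature_map(X):
    """Positive feature map elu(x) + 1 (an assumed choice for this sketch)."""
    return np.where(X > 0, X + 1.0, np.exp(np.minimum(X, 0.0)))


def far_field_attention(Q, K, V):
    """Low-rank attention: normalize(phi(Q) phi(K)^T) V, computed right to left.

    phi(K)^T V is a small (d x d_v) matrix, so the n x n matrix is never built
    and the cost is O(n * d * d_v).
    """
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv = phi_k.T @ V              # (d, d_v) summary of all keys and values
    normalizer = phi_q @ phi_k.sum(axis=0)  # row sums of phi(Q) phi(K)^T
    return (phi_q @ kv) / normalizer[:, None]


def fmm_style_attention(Q, K, V, bandwidth=2, w_near=0.5, w_far=0.5):
    """Blend near-field and far-field outputs (fixed weights are hypothetical)."""
    near = near_field_attention(Q, K, V, bandwidth)
    far = far_field_attention(Q, K, V)
    return w_near * near + w_far * far


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 16, 4
    Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
    print(fmm_style_attention(Q, K, V, bandwidth=2).shape)
```

Both terms scale linearly with the sequence length as long as `bandwidth` and the feature dimension stay fixed, which is the property the abstract contrasts with the quadratic cost of standard attention.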

Videos

FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention · SlidesLive

Taxonomy

Topics: Particle accelerators and beam dynamics · Computational Physics and Python Applications · Superconducting Materials and Applications