ELFATT: Efficient Linear Fast Attention for Vision Transformers

Chong Wu; Maolin Che; Renjie Xu; Zhuoheng Ran; Hong Yan

arXiv:2501.06098 · eess.IV · August 5, 2025 · ACM Multimedia

Open Access

TL;DR

ELFATT is a linear attention mechanism that speeds up vision transformers with minimal performance loss, making it suitable for high-resolution tasks and resource-constrained environments.

Contribution

The paper proposes ELFATT, a linear attention method that reduces memory and computational complexity while maintaining high performance. It is compatible with FlashAttention-2 and applies to both vision and non-vision long-range tasks.

Findings

01. 4-7x speedup over vanilla attention in high-res vision tasks

02. 2-3x speedup with FlashAttention-2 acceleration

03. Effective in non-vision long-range tasks and on edge GPUs

Abstract

The attention mechanism is the key to the success of transformers in different machine learning tasks. However, the quadratic complexity of the vanilla softmax-based attention mechanism with respect to sequence length becomes the major bottleneck when transformers are applied to long-sequence tasks, such as vision tasks. Although various efficient linear attention mechanisms have been proposed, they sacrifice performance to achieve high efficiency. Moreover, memory-efficient methods, such as FlashAttention-1 to FlashAttention-3, still have quadratic computational complexity, which leaves room for improvement. In this paper, we propose a novel efficient linear fast attention (ELFATT) mechanism that achieves low memory input/output operations, linear computational complexity, and high performance at the same time. ELFATT offers 4-7x speedups over the vanilla softmax-based attention mechanism in high-resolution…
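
To make the quadratic-versus-linear distinction in the abstract concrete, here is a minimal NumPy sketch. It is not ELFATT itself, since the truncated abstract does not describe the mechanism. It shows a generic kernel-style linear attention, assuming an elu(x) + 1 feature map, which is a common choice in the linear attention literature. The function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def softmax_attention(q, k, v):
    """Vanilla attention: builds an (n, n) score matrix, so cost grows as O(n^2)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (n, n)
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                               # (n, d_v)

def feature_map(x):
    """Assumed positive feature map, elu(x) + 1 (not taken from the paper)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v):
    """Generic linear attention: never forms the (n, n) matrix, so cost is O(n)."""
    phi_q, phi_k = feature_map(q), feature_map(k)
    kv = phi_k.T @ v                 # (d, d_v) summary of keys and values
    z = phi_k.sum(axis=0)            # (d,) normalizer summary
    return (phi_q @ kv) / (phi_q @ z)[:, None]

def linear_attention_quadratic(q, k, v):
    """Same result as linear_attention, computed the slow O(n^2) way for checking."""
    phi_q, phi_k = feature_map(q), feature_map(k)
    weights = phi_q @ phi_k.T                        # (n, n)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n, d = 256, 16                       # sequence length and head dimension
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out_linear = linear_attention(q, k, v)
out_check = linear_attention_quadratic(q, k, v)
print(np.allclose(out_linear, out_check))
```

The saving comes only from reordering the matrix products: the (d, d_v) summary `kv` is computed once and reused for every query. The output differs from `softmax_attention` because the kernel is different, which is the kind of performance trade-off the abstract attributes to earlier linear attention methods.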

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Image Processing Techniques and Applications · Advanced Memory and Neural Computing

Methods: Softmax · Attention Is All You Need · Diffusion