Is Attention Better Than Matrix Decomposition?
Zhengyang Geng, Meng-Hao Guo, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin

TL;DR
This paper challenges the supremacy of self-attention in deep learning by casting global context modeling as a low-rank recovery problem and showing that matrix decomposition methods can match or outperform self-attention at lower computational cost.
Contribution
The paper introduces Hamburgers, global context blocks that use matrix decomposition algorithms in place of self-attention, with demonstrated effectiveness in vision tasks.
Findings
Matrix decomposition can match or outperform self-attention in global context modeling.
Hamburgers improve performance in semantic segmentation and image generation.
The approach reduces computational costs compared to traditional self-attention.
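To give a feel for where the cost difference could come from, here is a rough multiply-add count. It is a back-of-the-envelope sketch under stated assumptions, not a figure from the paper: it assumes n spatial positions, d channels, a decomposition of rank r run for a fixed number of steps, and non-negative matrix factorization with multiplicative updates as the decomposition. The function names and the example sizes are hypothetical.

```python
def attention_madds(n, d):
    """Multiply-adds for Q K^T plus (weights) V on n positions with d channels.

    Ignores the linear projections and the softmax.
    """
    return 2 * n * n * d


def nmf_madds(n, d, r, steps):
    """Multiply-adds for `steps` multiplicative NMF updates of X ~ D C.

    X is (d, n), D is (d, r), C is (r, n). NMF is an assumed choice of
    decomposition for this estimate.
    """
    per_step = (
        r * d * n      # D^T X
        + r * d * r    # D^T D
        + r * r * n    # (D^T D) C
        + d * n * r    # X C^T
        + r * n * r    # C C^T
        + d * r * r    # D (C C^T)
    )
    # One extra product D C for the final low-rank reconstruction.
    return steps * per_step + d * r * n


# Illustrative sizes only (hypothetical, not taken from the paper).
n, d, r, steps = 4096, 512, 64, 6
ratio = attention_madds(n, d) / nmf_madds(n, d, r, steps)
print(f"attention / decomposition multiply-adds: {ratio:.1f}x")
```

The attention count grows quadratically in n, while the decomposition count grows linearly in n for fixed r and step count, which is consistent with the cost finding above.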
Abstract
As an essential ingredient of modern deep learning, attention mechanism, especially self-attention, plays a vital role in the global correlation discovery. However, is hand-crafted attention irreplaceable when modeling the global context? Our intriguing finding is that self-attention is not better than the matrix decomposition (MD) model developed 20 years ago regarding the performance and computational cost for encoding the long-distance dependencies. We model the global context issue as a low-rank recovery problem and show that its optimization algorithms can help design global information blocks. This paper then proposes a series of Hamburgers, in which we employ the optimization algorithms for solving MDs to factorize the input representations into sub-matrices and reconstruct a low-rank embedding. Hamburgers with different MDs can perform favorably against the popular global…
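The factorize-then-reconstruct idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions, not the authors' implementation: it assumes non-negative matrix factorization with multiplicative updates as the decomposition, a linear map plus ReLU before it, a linear map after it, and a residual connection. The names `nmf_low_rank`, `hamburger_like`, `W_in`, and `W_out`, and all sizes, are hypothetical.

```python
import numpy as np


def nmf_low_rank(X, rank=8, steps=6, eps=1e-8, seed=0):
    """Factorize a non-negative (d, n) matrix X as D @ C.

    Uses multiplicative updates, which keep D and C non-negative.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    D = rng.random((d, rank))  # dictionary, shape (d, r)
    C = rng.random((rank, n))  # codes, shape (r, n)
    for _ in range(steps):
        C *= (D.T @ X) / (D.T @ D @ C + eps)
        D *= (X @ C.T) / (D @ (C @ C.T) + eps)
    return D, C


def hamburger_like(X, W_in, W_out, rank=8, steps=6):
    """Assumed block layout: linear + ReLU, low-rank reconstruction, linear, residual."""
    Z = np.maximum(W_in @ X, 0.0)        # make the input non-negative for NMF
    D, C = nmf_low_rank(Z, rank, steps)  # factorize into two sub-matrices
    return X + W_out @ (D @ C)           # add the low-rank embedding back


rng = np.random.default_rng(1)
X = rng.standard_normal((16, 64))            # 16 channels, 64 flattened positions
W_in = 0.1 * rng.standard_normal((16, 16))   # hypothetical weights
W_out = 0.1 * rng.standard_normal((16, 16))
Y = hamburger_like(X, W_in, W_out)
print(Y.shape)  # (16, 64)
```

Every column of `D @ C` is a combination of the same `rank` dictionary columns, so each position's output depends on the whole input. That shared basis is what lets a decomposition carry global context without computing pairwise attention weights.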
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: Hamburger
