Synthesizer Based Efficient Self-Attention for Vision Tasks
Guangyang Zhu, Jianfeng Zhang, Yuanzhi Feng, Hai Lan

TL;DR
This paper introduces a novel tensor transformation-based self-attention module called STT, which reduces redundant computation and preserves tensor structure, improving efficiency and robustness in vision tasks like classification and captioning.
Contribution
The paper proposes the STT module that processes image tensor features directly, avoiding dot-product multiplication and maintaining tensor structure, which is a novel approach in vision self-attention.
Findings
STT achieves competitive performance on image classification.
STT maintains robustness compared to traditional self-attention.
STT reduces computational redundancy in vision models.
Abstract
The self-attention module shows outstanding competence in capturing long-range relationships while enhancing performance on vision tasks such as image classification and image captioning. However, the self-attention module relies heavily on dot-product multiplication and dimension alignment among query-key-value features, which causes two problems: (1) The dot-product multiplication results in exhaustive and redundant computation. (2) Because the visual feature map often appears as a multi-dimensional tensor, reshaping the tensor feature to satisfy the dimension alignment might destroy the internal structure of the tensor feature map. To address these problems, this paper proposes a self-attention plug-in module with its variants, namely Synthesizing Tensor Transformations (STT), for directly processing image tensor features. Without computing the dot-product…
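The two problems named in the abstract can be seen in a minimal sketch of the conventional dot-product self-attention baseline. This is not the STT module, whose details are not given in the text above; it is the standard formulation that STT is said to replace. The function names, the toy sizes, and the random projection weights are all assumptions chosen for illustration, and plain Python lists are used only to keep the sketch dependency-free.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_self_attention(feature_map, w_q, w_k, w_v):
    """Standard (baseline) self-attention over an H x W x C feature map."""
    h, w = len(feature_map), len(feature_map[0])
    # Problem (2): the H x W grid is flattened into N = H*W tokens, so the
    # 2-D layout of the tensor is no longer visible to the attention step.
    tokens = [feature_map[i][j] for i in range(h) for j in range(w)]
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    # Problem (1): the query-key dot products form an N x N matrix, so the
    # cost grows quadratically with the number of spatial positions.
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    out = matmul(weights, v)
    # Reshape the N x D output back into an H x W x D map.
    out_map = [out[i * w:(i + 1) * w] for i in range(h)]
    return out_map, weights


# Toy sizes (assumed): a 4 x 4 map with 3 channels, projected to 3 dimensions.
random.seed(0)
h, w, c, d = 4, 4, 3, 3
x = [[[random.gauss(0, 1) for _ in range(c)] for _ in range(w)] for _ in range(h)]
w_q, w_k, w_v = (
    [[random.gauss(0, 1) for _ in range(d)] for _ in range(c)] for _ in range(3)
)
out_map, weights = dot_product_self_attention(x, w_q, w_k, w_v)
print(len(weights), len(weights[0]))  # 16 16: one attention row per spatial position
```

Even on this 4 x 4 map the attention matrix has 16 x 16 entries; doubling each spatial side multiplies that count by 16, which is the redundancy the abstract refers to.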
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Tensor decomposition and applications
