Attention Mechanisms in Computer Vision: A Survey

Meng-Hao Guo; Tian-Xing Xu; Jiang-Jiang Liu; Zheng-Ning Liu; Peng-Tao Jiang; Tai-Jiang Mu; Song-Hai Zhang; Ralph R. Martin; Ming-Ming Cheng; Shi-Min Hu

arXiv:2111.07624·cs.CV·July 6, 2022

TL;DR

This survey reviews the development and application of attention mechanisms in computer vision, covering their use across many visual tasks and categorizing them by approach: channel, spatial, temporal, and branch attention.

Contribution

The survey provides a categorization and analysis of attention mechanisms in computer vision, along with a related repository and suggested directions for future research.

Findings

01. Attention mechanisms significantly improve performance in visual tasks.

02. Categorization of attention types aids understanding and development.

03. Future directions include exploring new attention architectures.

Abstract

Humans can naturally and effectively find salient regions in complex scenes. Motivated by this observation, attention mechanisms were introduced into computer vision with the aim of imitating this aspect of the human visual system. Such an attention mechanism can be regarded as a dynamic weight adjustment process based on features of the input image. Attention mechanisms have achieved great success in many visual tasks, including image classification, object detection, semantic segmentation, video understanding, image generation, 3D vision, multi-modal tasks and self-supervised learning. In this survey, we provide a comprehensive review of various attention mechanisms in computer vision and categorize them according to approach, such as channel attention, spatial attention, temporal attention and branch attention; a related repository…
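To make the phrase "dynamic weight adjustment process based on features of the input image" concrete, here is a minimal sketch of channel-style attention in plain Python. It is an illustrative toy, not a method taken from the survey: the function name `channel_attention` and the fixed sigmoid gate are assumptions made for this example, whereas practical modules learn the mapping from pooled features to weights.

```python
import math


def channel_attention(features):
    """Reweight each channel of a C x H x W feature map given as nested lists.

    Assumption for illustration: weight_c = sigmoid(mean of channel c).
    Real attention modules learn this mapping rather than fixing it.
    """
    weights = []
    for channel in features:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)  # global average pooling per channel
        weights.append(1.0 / (1.0 + math.exp(-mean)))  # gate in (0, 1)

    # Scale every value in a channel by that channel's input-dependent weight.
    reweighted = [
        [[v * w for v in row] for row in channel]
        for channel, w in zip(features, weights)
    ]
    return weights, reweighted


# Two 2x2 channels: one with no activation, one with strong activation.
features = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
weights, out = channel_attention(features)
print(weights)
```

The weights are computed from the input itself, so a different image yields different weights; here the strongly activated channel receives a larger weight than the inactive one. Spatial attention follows the same pattern but assigns one weight per location instead of one per channel.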

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

MenghaoGuo/Awesome-Vision-Attentions
Official
