All the attention you need: Global-local, spatial-channel attention for image retrieval
Chull Hwan Song, Hye Joo Han, Yannis Avrithis

TL;DR
This paper introduces GLAM, a comprehensive attention module combining all four forms of attention—local/global and spatial/channel—to enhance image retrieval performance, achieving state-of-the-art results on standard benchmarks.
Contribution
The paper proposes a novel global-local attention module (GLAM) that integrates all four forms of attention for improved image retrieval representations.
Findings
GLAM improves retrieval accuracy on benchmark datasets.
All four forms of attention interact to enhance feature representation.
The proposed method achieves state-of-the-art performance on standard benchmarks.
Abstract
We address representation learning for large-scale instance-level image retrieval. Apart from backbone, training pipelines and loss functions, popular approaches have focused on different spatial pooling and attention mechanisms, which are at the core of learning a powerful global image representation. There are different forms of attention according to the interaction of elements of the feature tensor (local and global) and the dimensions where it is applied (spatial and channel). Unfortunately, each study addresses only one or two forms of attention and applies it to different problems like classification, detection or retrieval. We present global-local attention module (GLAM), which is attached at the end of a backbone network and incorporates all four forms of attention: local and global, spatial and channel. We obtain a new feature tensor and, by spatial pooling, we learn a…
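The four forms of attention named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' GLAM implementation: it has no learned parameters, and the function names (`local_channel`, `local_spatial`, `global_channel`, `global_spatial`, `glam_like`, `global_descriptor`), the averaging-based fusion, and the L2 normalization are all assumptions made for illustration. It only shows how "local" attention reweights each element independently, how "global" attention lets elements interact through pairwise affinities, and how spatial pooling turns the resulting feature tensor into one global descriptor.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_channel(F):
    # Local channel attention: one weight per channel from its spatial mean.
    w = sigmoid(F.mean(axis=(1, 2)))           # shape (C,)
    return F * w[:, None, None]

def local_spatial(F):
    # Local spatial attention: one weight per location from its channel mean.
    w = sigmoid(F.mean(axis=0))                # shape (H, W)
    return F * w[None, :, :]

def global_channel(F):
    # Global channel attention: channels interact through a C x C affinity.
    C, H, W = F.shape
    X = F.reshape(C, H * W)
    A = softmax(X @ X.T / np.sqrt(H * W), axis=-1)   # rows sum to 1
    return (A @ X).reshape(C, H, W)

def global_spatial(F):
    # Global spatial attention: locations interact through an HW x HW affinity.
    C, H, W = F.shape
    X = F.reshape(C, H * W)
    A = softmax(X.T @ X / np.sqrt(C), axis=-1)       # rows sum to 1
    return (X @ A.T).reshape(C, H, W)

def glam_like(F):
    # Assumed fusion: plain average of the input, local, and global streams.
    local = local_spatial(local_channel(F))
    glob = global_spatial(global_channel(F))
    return (F + local + glob) / 3.0

def global_descriptor(F):
    # Spatial average pooling followed by L2 normalization (an assumption).
    v = F.mean(axis=(1, 2))
    return v / (np.linalg.norm(v) + 1e-12)

# Stand-in for a backbone output with C=8 channels on a 4x5 spatial grid.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 5))
descriptor = global_descriptor(glam_like(features))
```

With unit-norm descriptors, two images can be compared with a dot product, which is the usual way a global representation is used for retrieval ranking. The fixed averaging used here to fuse the streams is only a placeholder for whatever fusion a trained module would apply.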
Videos
All the attention you need: Global-local, spatial-channel attention for image retrieval (YouTube)
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
Methods: Global Local Attention Module · Global-Local Attention
