Explicitly Modeled Attention Maps for Image Classification

Andong Tan; Duc Tam Nguyen; Maximilian Dax; Matthias Nießner; Thomas Brox

arXiv:2006.07872 · cs.CV · March 19, 2021

TL;DR

This paper introduces a novel self-attention module with explicitly modeled attention-maps using geometric priors, reducing computational costs while improving accuracy in image classification tasks.

Contribution

The paper proposes a simple, efficient self-attention mechanism with explicit attention-maps based on geometric priors, requiring only a single learnable parameter.

Findings

01 Achieves up to 2.2% accuracy improvement over ResNet baselines on ImageNet.

02 Outperforms other self-attention methods like AA-ResNet152 in accuracy by 0.9%.

03 Uses fewer parameters and GFLOPs, demonstrating computational efficiency.

Abstract

Self-attention networks have shown remarkable progress in computer vision tasks such as image classification. The main benefit of the self-attention mechanism is its ability to capture long-range feature interactions in attention-maps. However, computing attention-maps requires a learnable key, query, and positional encoding, whose usage is often unintuitive and computationally expensive. To mitigate this problem, we propose a novel self-attention module with explicitly modeled attention-maps that uses only a single learnable parameter for low computational overhead. The design of explicitly modeled attention-maps using a geometric prior is based on the observation that the spatial context for a given pixel within an image is mostly dominated by its neighbors, while more distant pixels make only a minor contribution. Concretely, the attention-maps are parametrized via simple functions…
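
The geometric prior described in the abstract can be illustrated with a small sketch. This is not the authors' module: the excerpt is cut off before it names the functions used, so the Gaussian falloff here is an assumption, as are the hypothetical names `explicit_attention_map`, `attend`, and `sigma`. In a trained network the single width parameter would be learnable; here it is a plain float.

```python
import math


def explicit_attention_map(height, width, center, sigma):
    """Attention weights over an H x W grid for one query pixel.

    Assumption: weights fall off as a Gaussian of the distance to the
    query pixel, controlled by a single width parameter `sigma`.
    The weights are normalized to sum to 1.
    """
    cy, cx = center
    weights = [
        [
            math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def attend(features, sigma):
    """Apply the explicit attention map to a single-channel H x W feature grid.

    No keys, queries, or positional encodings are computed: each output
    pixel is a distance-weighted average of all input pixels.
    """
    height, width = len(features), len(features[0])
    out = [[0.0] * width for _ in range(height)]
    for qy in range(height):
        for qx in range(width):
            amap = explicit_attention_map(height, width, (qy, qx), sigma)
            out[qy][qx] = sum(
                amap[y][x] * features[y][x]
                for y in range(height)
                for x in range(width)
            )
    return out
```

A small `sigma` concentrates each pixel's attention on its immediate neighbors, while a large `sigma` approaches uniform averaging over the whole image, which is one way a single parameter could trade off local against long-range context.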

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning