Rotate to Attend: Convolutional Triplet Attention Module

Diganta Misra; Trikay Nalamada; Ajay Uppili Arasanipalai; Qibin Hou

arXiv:2010.03045 · cs.CV · November 9, 2020 · 51 cites

TL;DR

This paper introduces a lightweight triplet attention module that captures cross-dimensional dependencies efficiently, improving performance in image classification and object detection tasks with minimal computational overhead.

Contribution

The paper proposes a novel triplet attention mechanism with a three-branch structure that effectively models inter-dimensional dependencies and can be easily integrated into existing networks.

Findings

01. Improves accuracy on ImageNet-1k classification

02. Improves object detection performance on MSCOCO and PASCAL VOC

03. Adds negligible computational overhead to the backbone network

Abstract

Benefiting from the capability of building inter-dependencies among channels or spatial locations, attention mechanisms have been extensively studied and broadly used in a variety of computer vision tasks recently. In this paper, we investigate light-weight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension interaction using a three-branch structure. For an input tensor, triplet attention builds inter-dimensional dependencies by the rotation operation followed by residual transformations and encodes inter-channel and spatial information with negligible computational overhead. Our method is simple as well as efficient and can be easily plugged into classic backbone networks as an add-on module. We demonstrate the effectiveness of our method on various challenging tasks including image…
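
The three-branch idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract only names the rotation operation and the three branches; the max/mean pooling, the sigmoid gate, the simple averaging of the branches, and the helper names (`z_pool`, `attention_gate`, `triplet_attention`) are assumptions made for illustration. In particular, the two mixing weights per branch are a hypothetical stand-in for the learned convolution and batch normalization listed under Methods.

```python
import numpy as np

def z_pool(x):
    """Stack max- and mean-pooling over axis 0, giving shape (2, A, B)."""
    return np.stack([x.max(axis=0), x.mean(axis=0)], axis=0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_gate(x, w):
    """Gate x of shape (D, A, B) with a 2D map computed by pooling over D.

    w holds two mixing weights (assumption: a stand-in for the learned
    convolution + batch norm that a real module would use).
    """
    pooled = z_pool(x)                         # (2, A, B)
    logits = np.tensordot(w, pooled, axes=1)   # (A, B)
    return x * sigmoid(logits)[None, :, :]     # broadcast the gate over D

def triplet_attention(x, weights):
    """x has shape (C, H, W); weights is a list of three length-2 arrays."""
    w_cw, w_hc, w_hw = weights
    # Branch 1: rotate H to the front, pool over H, so the gate lives on (C, W).
    b1 = attention_gate(x.transpose(1, 0, 2), w_cw).transpose(1, 0, 2)
    # Branch 2: rotate W to the front, pool over W, so the gate lives on (H, C).
    b2 = attention_gate(x.transpose(2, 1, 0), w_hc).transpose(2, 1, 0)
    # Branch 3: no rotation, pool over C, so the gate lives on (H, W).
    b3 = attention_gate(x, w_hw)
    # Assumption: the three branch outputs are combined by a plain average.
    return (b1 + b2 + b3) / 3.0

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 6))             # toy (C, H, W) feature map
weights = [np.array([0.5, 0.5])] * 3           # hypothetical fixed weights
out = triplet_attention(x, weights)
print(out.shape)                               # (8, 5, 6)
```

Each branch rotates a different axis to the front, so each gate couples a different pair of dimensions while the output keeps the input shape. That shape-preserving property is what would let such a module be dropped into an existing backbone.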

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques

Methods: Triplet Attention · Average Pooling · Residual Connection · Batch Normalization · 1x1 Convolution · Max Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block