Rotate to Attend: Convolutional Triplet Attention Module
Diganta Misra, Trikay Nalamada, Ajay Uppili Arasanipalai, Qibin Hou

TL;DR
This paper introduces a lightweight triplet attention module that captures cross-dimensional dependencies efficiently, improving performance in image classification and object detection tasks with minimal computational overhead.
Contribution
The paper proposes a novel triplet attention mechanism with a three-branch structure that effectively models inter-dimensional dependencies and can be easily integrated into existing networks.
Findings
Improves accuracy on ImageNet-1k classification
Enhances object detection performance on MSCOCO and PASCAL VOC
Maintains low computational cost and high efficiency
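To put "low computational cost" in concrete terms, the sketch below gives a rough parameter estimate. None of its numbers come from this summary; they are labeled assumptions: each of the three branches is taken to hold one bias-free k x k convolution with 2 input channels and 1 output channel, k is set to 7, one module is attached to each of the 16 bottleneck blocks of a ResNet-50, and the backbone is taken to have roughly 25.6 million parameters. Normalization parameters are ignored.

```python
# Back-of-the-envelope parameter estimate under the assumptions stated above.

def triplet_attention_weights(kernel_size=7):
    """Convolution weights in one module: 3 branches x 2 input channels x k x k."""
    branches = 3
    in_channels = 2  # assumed: one max-pooled and one mean-pooled map per branch
    return branches * in_channels * kernel_size ** 2


# Assumed ResNet-50 layout: 3 + 4 + 6 + 3 bottleneck blocks, one module per block.
NUM_BOTTLENECK_BLOCKS = 3 + 4 + 6 + 3
# Assumed approximate size of a plain ResNet-50.
BACKBONE_PARAMS = 25.6e6

extra = NUM_BOTTLENECK_BLOCKS * triplet_attention_weights()
print(f"extra weights: {extra}")
print(f"relative increase: {extra / BACKBONE_PARAMS:.4%}")
```

Under these assumptions the added weights stay well below a tenth of a percent of the backbone, and the count does not depend on the number of channels in the feature map.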
Abstract
Benefiting from the capability of building inter-dependencies among channels or spatial locations, attention mechanisms have been extensively studied and broadly used in a variety of computer vision tasks recently. In this paper, we investigate light-weight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension interaction using a three-branch structure. For an input tensor, triplet attention builds inter-dimensional dependencies by the rotation operation followed by residual transformations and encodes inter-channel and spatial information with negligible computational overhead. Our method is simple as well as efficient and can be easily plugged into classic backbone networks as an add-on module. We demonstrate the effectiveness of our method on various challenging tasks including image…
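The three-branch rotation idea in the abstract can be sketched as a forward pass in NumPy. This is a minimal illustration, not the authors' implementation. The following details are assumptions not stated in this summary: each branch pools its leading dimension into a max map and a mean map, applies a single 7 x 7 convolution and a sigmoid, and the three branch outputs are averaged. Batch normalization and training are omitted, and the function names (`z_pool`, `conv2d_same`, `attention_gate`, `triplet_attention`) are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def z_pool(x):
    """Stack max and mean over axis 1: (N, D, A, B) -> (N, 2, A, B)."""
    return np.stack([x.max(axis=1), x.mean(axis=1)], axis=1)


def conv2d_same(x, weight):
    """Zero-padded cross-correlation: x (N, 2, A, B), weight (2, k, k) -> (N, A, B)."""
    k = weight.shape[-1]  # assumed odd so the output keeps the input size
    pad = k // 2
    padded = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    windows = sliding_window_view(padded, (k, k), axis=(2, 3))  # (N, 2, A, B, k, k)
    return np.einsum("ncabij,cij->nab", windows, weight)


def attention_gate(x, weight):
    """Pool axis 1, build a 2D attention map over the last two axes, rescale x."""
    logits = conv2d_same(z_pool(x), weight)
    scale = 1.0 / (1.0 + np.exp(-logits))  # sigmoid, shape (N, A, B)
    return x * scale[:, None, :, :]


def triplet_attention(x, w_cw, w_ch, w_hw):
    """x: (N, C, H, W); each weight: (2, k, k). Returns an array shaped like x."""
    # Branch 1: rotate so H is pooled and the map spans (C, W), then rotate back.
    out_cw = np.swapaxes(attention_gate(np.swapaxes(x, 1, 2), w_cw), 1, 2)
    # Branch 2: rotate so W is pooled and the map spans (H, C), then rotate back.
    out_ch = np.swapaxes(attention_gate(np.swapaxes(x, 1, 3), w_ch), 1, 3)
    # Branch 3: no rotation; C is pooled and the map spans (H, W).
    out_hw = attention_gate(x, w_hw)
    return (out_cw + out_ch + out_hw) / 3.0


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 16, 16))
weights = [0.1 * rng.standard_normal((2, 7, 7)) for _ in range(3)]
y = triplet_attention(x, *weights)
print(y.shape)  # (2, 8, 16, 16)
```

Because the output has the same shape as the input, a module of this kind can be dropped between existing layers of a backbone, which matches the "add-on module" framing in the abstract. In a real network the random weights here would be learned parameters.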
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
Methods: Triplet Attention · Average Pooling · Residual Connection · Batch Normalization · 1x1 Convolution · Max Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block
