ELA: Efficient Local Attention for Deep Convolutional Neural Networks
Wei Xu, Yi Wan

TL;DR
The paper introduces Efficient Local Attention (ELA), a lightweight attention module that encodes positional information to improve deep CNNs on several vision tasks, without reducing channel dimensions or complicating the network structure.
Contribution
The paper proposes ELA, which uses 1D convolution and Group Normalization to address the limitations it identifies in Coordinate Attention, and offers multiple versions that can be adapted to different tasks.
Findings
ELA outperforms state-of-the-art methods on ImageNet, MSCOCO, and Pascal VOC datasets.
ELA improves accuracy in image classification, object detection, and semantic segmentation.
The method is compatible with popular CNN architectures like ResNet, MobileNet, and DeepLab.
Abstract
The attention mechanism has gained significant recognition in the field of computer vision due to its ability to effectively enhance the performance of deep neural networks. However, existing methods often struggle to effectively utilize spatial information or, if they do, they come at the cost of reducing channel dimensions or increasing the complexity of neural networks. In order to address these limitations, this paper introduces an Efficient Local Attention (ELA) method that achieves substantial performance improvements with a simple structure. By analyzing the limitations of the Coordinate Attention method, we identify the lack of generalization ability in Batch Normalization, the adverse effects of dimension reduction on channel attention, and the complexity of the attention generation process. To overcome these challenges, we propose the incorporation of 1D convolution and Group…
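To make the idea concrete, here is a minimal pure-Python sketch of a positional attention block built from 1D convolution and group normalization. The abstract is truncated before it describes the design, so everything beyond those two ingredients is an assumption made for illustration: the strip pooling along height and width (borrowed from the Coordinate Attention style the abstract refers to), the per-channel kernels, the sigmoid gating, group normalization without learnable scale and shift, and all function and parameter names (`strip_pool`, `depthwise_conv1d`, `group_norm`, `ela_sketch`, `kernels`, `num_groups`). It is not the authors' implementation, and a practical version would use a deep learning framework rather than nested lists.

```python
import math


def strip_pool(x):
    """x is a [C][H][W] nested list. Returns per-channel means:
    pooled_h[c][i] averages row i over width, pooled_w[c][j] averages column j over height."""
    pooled_h = [[sum(row) / len(row) for row in ch] for ch in x]
    pooled_w = [[sum(col) / len(col) for col in zip(*ch)] for ch in x]
    return pooled_h, pooled_w


def depthwise_conv1d(seq, kernel):
    """Zero-padded 1D cross-correlation that keeps the sequence length (odd kernel size)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(seq))]


def group_norm(feats, num_groups, eps=1e-5):
    """feats is [C][L]. Normalizes each group of channels to zero mean and unit variance.
    Assumption: no learnable scale or shift, and C is divisible by num_groups."""
    size = len(feats) // num_groups
    out = []
    for g in range(num_groups):
        group = feats[g * size:(g + 1) * size]
        vals = [v for ch in group for v in ch]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * inv_std for v in ch] for ch in group])
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def ela_sketch(x, kernels, num_groups):
    """Hypothetical attention block: pool -> 1D conv -> group norm -> sigmoid -> rescale.
    kernels holds one 1D kernel per channel and is shared by both directions."""
    pooled_h, pooled_w = strip_pool(x)

    def attend(pooled):
        conv = [depthwise_conv1d(seq, k) for seq, k in zip(pooled, kernels)]
        return [[sigmoid(v) for v in ch] for ch in group_norm(conv, num_groups)]

    a_h, a_w = attend(pooled_h), attend(pooled_w)
    # Each position is scaled by its row gate and its column gate.
    return [[[x[c][i][j] * a_h[c][i] * a_w[c][j]
              for j in range(len(x[c][i]))]
             for i in range(len(x[c]))]
            for c in range(len(x))]
```

Because both gates lie strictly between 0 and 1, the output keeps the input's shape and never exceeds the input in magnitude. The channel count is untouched throughout, which matches the abstract's concern about dimension reduction.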
Taxonomy
Topics: Neural Networks and Applications
Methods: Average Pooling · Conditional Random Field · Dense Connections · Max Pooling · Kaiming Initialization · Batch Normalization · Global Average Pooling · Feedforward Network · Dilated Convolution · Group Normalization
