Interpreting and Improving Attention From the Perspective of Large Kernel Convolution
Chenghao Li, Chaoning Zhang, Boheng Zeng, Yi Lu, Pengbo Shi, Qingzi Chen, Jirui Liu, Lingyun Zhu, Yang Yang, Heng Tao Shen

TL;DR
This paper introduces Large Kernel Convolutional Attention (LKCA), a novel approach that reinterprets attention as large-kernel convolution, combining local and global features efficiently for visual tasks, especially in resource-limited scenarios.
Contribution
The paper proposes LKCA, a unified convolutional attention mechanism that enhances local and global feature modeling, addressing limitations of traditional attention in data-scarce and resource-constrained settings.
Findings
LKCA outperforms traditional attention mechanisms on multiple datasets.
LKCA achieves competitive results with fewer resources.
LKCA effectively combines local and global features in vision tasks.
Abstract
Attention mechanisms have significantly advanced visual models by capturing global context effectively. However, their reliance on large-scale datasets and substantial computational resources poses challenges in data-scarce and resource-constrained scenarios. Moreover, traditional self-attention mechanisms lack inherent spatial inductive biases, making them suboptimal for modeling local features critical to tasks involving smaller datasets. In this work, we introduce Large Kernel Convolutional Attention (LKCA), a novel formulation that reinterprets attention operations as a single large-kernel convolution. This design unifies the strengths of convolutional architectures (locality and translation invariance) with the global context modeling capabilities of self-attention. By embedding these properties into a computationally efficient framework, LKCA addresses key limitations of traditional…
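The idea of treating attention as one large-kernel convolution can be illustrated with a small sketch. The code below is not the authors' LKCA implementation; it is a pure-Python, single-channel example under the assumption of zero padding and an odd, square kernel, and the names large_kernel_conv2d and receptive_field_size are hypothetical. It shows that once the kernel is wide enough (at least 2N - 1 on an N x N feature map), every output position can draw on every input position, as in global attention, while the weights stay shared across positions, as in an ordinary convolution.

```python
# Illustrative sketch only: NOT the authors' LKCA code.
# Assumptions: one channel, zero padding, odd square kernel, fixed weights.

def large_kernel_conv2d(feature_map, kernel):
    """Single-channel 2D cross-correlation with zero padding ("same" size)."""
    height, width = len(feature_map), len(feature_map[0])
    k_h, k_w = len(kernel), len(kernel[0])
    pad_h, pad_w = k_h // 2, k_w // 2  # assumes odd kernel sizes
    output = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for di in range(k_h):
                for dj in range(k_w):
                    y, x = i + di - pad_h, j + dj - pad_w
                    # Positions outside the map are treated as zeros.
                    if 0 <= y < height and 0 <= x < width:
                        total += kernel[di][dj] * feature_map[y][x]
            output[i][j] = total
    return output


def receptive_field_size(height, width, kernel_size, i, j):
    """Count the input positions that can influence output position (i, j)."""
    pad = kernel_size // 2
    rows = min(height - 1, i + pad) - max(0, i - pad) + 1
    cols = min(width - 1, j + pad) - max(0, j - pad) + 1
    return rows * cols


if __name__ == "__main__":
    size = 4
    # A single impulse in the top-left corner of a 4 x 4 map.
    impulse = [[0.0] * size for _ in range(size)]
    impulse[0][0] = 1.0

    small = [[1.0] * 3 for _ in range(3)]  # local 3 x 3 kernel
    large = [[1.0] * 7 for _ in range(7)]  # 2 * 4 - 1 = 7 covers the whole map

    print(receptive_field_size(size, size, 3, 0, 0))  # corner sees a 2 x 2 patch
    print(receptive_field_size(size, size, 7, 0, 0))  # corner sees all 16 inputs
    print(large_kernel_conv2d(impulse, small)[3][3])  # impulse does not reach far corner
    print(large_kernel_conv2d(impulse, large)[3][3])  # impulse reaches far corner
```

Unlike self-attention, the kernel weights in this sketch do not depend on the input, which is one way to read the locality and translation-invariance bias the abstract refers to. A practical version would use a learned kernel per channel in a deep learning framework rather than nested Python loops.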
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Visual Attention and Saliency Detection
Methods: Convolution
