Cross-Layer Retrospective Retrieving via Layer Attention
Yanwen Fang, Yuxi Cai, Jintai Chen, Jingyu Zhao, Guangjian Tian, Guodong Li

TL;DR
This paper introduces a cross-layer attention mechanism called MRLA that enhances neural network representations by retrieving information from previous layers, improving performance across vision tasks with minimal additional computation.
Contribution
The paper proposes a novel multi-head recurrent layer attention mechanism that effectively integrates cross-layer information, boosting vision network performance with a lightweight design.
Findings
Improves ResNet-50 Top-1 accuracy by 1.6%.
Enhances dense prediction tasks with 3-4% gains in box and mask AP.
A lightweight version of MRLA reduces the quadratic computation cost of attending to all previous layers.
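To see where the quadratic cost comes from, a rough count helps. The sketch below assumes a hypothetical stack of T layers in which layer t sends one query to itself and every earlier layer; T and this counting convention are illustrative assumptions, not figures from the paper.

```latex
% Assumption: layer t compares its query against t sets of keys (layers 1..t).
\[
  \sum_{t=1}^{T} t \;=\; \frac{T(T+1)}{2} \;=\; O(T^{2})
\]
% Example with a hypothetical T = 16: 16 * 17 / 2 = 136 query-layer comparisons.
```

Under the same assumption, any scheme that does a constant amount of cross-layer work per layer would grow only linearly in T. How the lightweight MRLA achieves its saving is not spelled out in this summary.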
Abstract
More and more evidence has shown that strengthening layer interactions can enhance the representation power of a deep neural network, while self-attention excels at learning interdependencies by retrieving query-activated information. Motivated by this, we devise a cross-layer attention mechanism, called multi-head recurrent layer attention (MRLA), that sends a query representation of the current layer to all previous layers to retrieve query-related information from different levels of receptive fields. A light-weighted version of MRLA is also proposed to reduce the quadratic computation cost. The proposed layer attention mechanism can enrich the representation power of many state-of-the-art vision networks, including CNNs and vision transformers. Its effectiveness has been extensively evaluated in image classification, object detection and instance segmentation tasks, where…
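The retrieval step described in the abstract can be illustrated with a minimal single-head sketch. It assumes each layer is summarized by one key vector and one value vector, and that the query, keys, and values have already been projected; the function names `softmax` and `layer_attention` and the toy numbers are hypothetical, and this is plain scaled dot-product attention over layers rather than the authors' MRLA implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def layer_attention(query, keys, values):
    """Attend from the current layer's query over per-layer keys and values.

    query:  list of d floats (current layer, assumed already projected)
    keys:   one list of d floats per layer, earliest layer first
    values: one list of d_v floats per layer, same order as keys
    Returns the retrieved vector and the attention weight given to each layer.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


# Toy example: three layers; the third one plays the role of the current layer.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [2.0, 0.0]

output, weights = layer_attention(query, keys, values)
print(weights)
```

In this toy setup the query aligns equally with the first and third keys, so those two layers receive the same weight and the second layer receives less. Because the values are one-hot, the output simply mirrors the weights.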
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
