TL;DR
This paper introduces an attention pyramid approach for person re-identification that captures multi-scale attention regions, mimicking human visual perception, and improves accuracy with minimal computational overhead.
Contribution
The paper presents a novel attention pyramid module that extends existing attention mechanisms to attend at multiple scales, significantly boosting re-identification performance.
Findings
Outperforms state-of-the-art methods on four benchmarks.
Works with different attention mechanisms, such as channel-wise and spatial attention.
Lightweight and easy to integrate into existing models.
Abstract
In this paper, we propose an attention pyramid method for person re-identification. Unlike conventional attention-based methods which only learn a global attention map, our attention pyramid exploits the attention regions in a multi-scale manner because human attention varies with different scales. Our attention pyramid imitates the process of human visual perception which tends to notice the foreground person over the cluttered background, and further focus on the specific color of the shirt with close observation. Specifically, we describe our attention pyramid by a "split-attend-merge-stack" principle. We first split the features into multiple local parts and learn the corresponding attentions. Then, we merge local attentions and stack these merged attentions with the residual connection as an attention pyramid. The proposed attention pyramid is a lightweight plug-and-play module…
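The "split-attend-merge-stack" principle described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the split is along the channel axis, swaps the learned attention for a parameter-free softmax over each part's average-pooled channels, and picks a coarse-to-fine schedule of 1, 2, and 4 parts. The names `local_attention`, `pyramid_level`, and `attention_pyramid` are hypothetical.

```python
import math


def local_attention(part):
    """Toy stand-in for a learned attention (assumption): softmax over the
    average-pooled value of each channel in this part."""
    means = [sum(ch) / len(ch) for ch in part]
    top = max(means)  # subtract the max for numerical stability
    exps = [math.exp(m - top) for m in means]
    total = sum(exps)
    return [e / total for e in exps]


def pyramid_level(features, num_parts):
    """One pyramid level: split channels, attend per part, merge, add residual.

    `features` is a list of channels; each channel is a flat list of floats.
    """
    if len(features) % num_parts != 0:
        raise ValueError("channel count must be divisible by num_parts")
    size = len(features) // num_parts
    merged = []
    for p in range(num_parts):
        part = features[p * size:(p + 1) * size]  # split
        merged.extend(local_attention(part))      # attend, then merge by concatenation
    # Residual connection: keep the input and add the attended features.
    return [[x + a * x for x in ch] for ch, a in zip(features, merged)]


def attention_pyramid(features, parts_per_level=(1, 2, 4)):
    """Stack levels so each one refines the output of the previous one.

    The coarse-to-fine order (1, 2, 4) is an assumption for illustration.
    """
    out = features
    for num_parts in parts_per_level:
        out = pyramid_level(out, num_parts)
    return out


if __name__ == "__main__":
    feats = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]  # 4 channels, 2 positions
    for channel in attention_pyramid(feats):
        print(channel)
```

In this sketch the gates within a part sum to 1, so a finer split lets each local part emphasize its own strongest channel instead of competing with every other channel globally. That is one simple way to see why attention computed at several scales can differ from a single global map.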
Taxonomy
Methods: Residual Connection
