Spatially and Temporally Efficient Non-local Attention Network for Video-based Person Re-Identification
Chih-Ting Liu, Chih-Wei Wu, Yu-Chiang Frank Wang, Shao-Yi Chien

TL;DR
This paper introduces a novel non-local attention network for video-based person re-identification that enhances feature representation by refining intermediate and high-level features while reducing computational complexity.
Contribution
The paper proposes two models: NVAN, which applies non-local attention at multiple feature levels, and STE-NVAN, which exploits spatial and temporal redundancy to make that attention more efficient while remaining accurate for person re-identification.
Findings
NVAN outperforms state-of-the-art by 3.8% in rank-1 accuracy on MARS.
STE-NVAN significantly reduces computation compared to existing methods.
Both models incorporate multi-level features, and STE-NVAN additionally exploits spatial and temporal redundancy to reduce computation.
Abstract
Video-based person re-identification (Re-ID) aims at matching video sequences of pedestrians across non-overlapping cameras. A practical yet challenging problem is how to embed the spatial and temporal information of a video into its feature representation. While most existing methods learn video characteristics by aggregating image-wise features and designing attention mechanisms in neural networks, they only explore the correlation between frames at the level of high-level features. In this work, we aim to refine the intermediate features as well as the high-level features with non-local attention operations, and we make two contributions. (i) We propose a Non-local Video Attention Network (NVAN) to incorporate video characteristics into the representation at multiple feature levels. (ii) We further introduce a Spatially and Temporally Efficient Non-local Video Attention Network (STE-NVAN) to…
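To make the phrase "non-local attention operations" concrete, here is a minimal sketch of a generic non-local (self-attention) step over flattened spatio-temporal positions. It is not the authors' NVAN implementation: the function names `softmax` and `non_local_attention` are hypothetical, the learned embedding and output projections of a typical non-local block are replaced by identity mappings, and the dot-product softmax weighting is an assumed choice.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def non_local_attention(features):
    """Apply one simplified non-local step with a residual connection.

    features: list of N vectors (lists of C floats), where N stands for the
    flattened frame-and-pixel positions of a video feature map.
    Assumption: identity embeddings, so no learned weights are involved.
    """
    refined = []
    for x_i in features:
        # Similarity of position i to every position j (dot product).
        scores = [sum(a * b for a, b in zip(x_i, x_j)) for x_j in features]
        weights = softmax(scores)
        # Weighted sum of all positions, computed per channel.
        y_i = [
            sum(w * x_j[c] for w, x_j in zip(weights, features))
            for c in range(len(x_i))
        ]
        # Residual connection: the input plus the attended response.
        refined.append([x + y for x, y in zip(x_i, y_i)])
    return refined


if __name__ == "__main__":
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in non_local_attention(toy):
        print([round(v, 3) for v in row])
```

Every position attends to every other position, so the number of pairwise scores grows quadratically with the number of positions. That cost is why reducing the positions involved, as the efficiency-oriented STE-NVAN variant aims to do through spatial and temporal redundancy, can lower computation.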
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Advanced Neural Network Applications
