Video-based Person Re-identification with Spatial and Temporal Memory Networks
Chanho Eom, Geon Lee, Junghyup Lee, Bumsub Ham

TL;DR
This paper introduces Spatial and Temporal Memory Networks (STMN) for video-based person re-identification, effectively handling distractors by leveraging learned spatial and temporal patterns to improve sequence-level person representations.
Contribution
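STMN uses two dedicated memories: a spatial memory that stores features of recurring spatial distractors, and a temporal memory that stores attentions for typical temporal patterns. The spatial memory is used to refine frame-level features and the temporal memory to aggregate them into a sequence-level representation, improving re-identification accuracy.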
Findings
Outperforms existing methods on MARS, DukeMTMC-VideoReID, and LS-VID benchmarks.
Effectively handles spatial distractors like background clutter.
Successfully models temporal patterns such as partial occlusions.
Abstract
Video-based person re-identification (reID) aims to retrieve person videos with the same identity as a query person across multiple cameras. Spatial and temporal distractors in person videos, such as background clutter and partial occlusions over frames, respectively, make this task much more challenging than image-based person reID. We observe that spatial distractors appear consistently in a particular location, and temporal distractors show several patterns, e.g., partial occlusions occur in the first few frames, where such patterns provide informative cues for predicting which frames to focus on (i.e., temporal attentions). Based on this, we introduce a novel Spatial and Temporal Memory Networks (STMN). The spatial memory stores features for spatial distractors that frequently emerge across video frames, while the temporal memory saves attentions which are optimized for typical…
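The abstract is cut off before it describes how the two memories are read, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a soft key-value lookup for both memories, one feature vector per frame (a simplification that ignores spatial positions within a frame), subtraction as the way retrieved distractor features are suppressed, and a fixed sequence length T. All names (`read_memory`, `refine_and_aggregate`, `s_keys`, `t_values`, and so on) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def read_memory(queries, keys, values):
    """Soft key-value lookup: match queries to keys, return a weighted sum of values.

    queries: (N, D), keys: (M, D), values: (M, Dv) -> output: (N, Dv)
    """
    weights = softmax(queries @ keys.T, axis=-1)  # (N, M) matching weights over memory items
    return weights @ values

def refine_and_aggregate(frames, s_keys, s_values, t_keys, t_values):
    """frames: (T, D) frame-level features for one video.

    s_keys, s_values: (Ms, D) hypothetical spatial memory (distractor features).
    t_keys: (Mt, D), t_values: (Mt, T) hypothetical temporal memory (attention logits).
    """
    # Assumption: retrieved distractor features are suppressed by subtraction.
    distractors = read_memory(frames, s_keys, s_values)     # (T, D)
    refined = frames - distractors

    # Assumption: the temporal memory is queried with the mean refined feature
    # and returns one attention logit per frame.
    seq_query = refined.mean(axis=0, keepdims=True)         # (1, D)
    attn = softmax(read_memory(seq_query, t_keys, t_values), axis=-1)[0]  # (T,)

    # Sequence-level representation: attention-weighted sum of refined frames.
    return attn @ refined, attn
```

In this sketch the memories are plain arrays passed in by the caller; in a trained model they would be learned parameters, and how they are optimized is not covered by the truncated abstract.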
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gait Recognition and Analysis · Face Recognition and Analysis
