Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect
Xinyang Jiang, Yifei Gong, Xiaowei Guo, Qize Yang, Feiyue Huang, Weishi Zheng, Feng Zheng, Xing Sun

TL;DR
This paper introduces a novel temporal fusion framework for video-based person re-identification that considers semantic differences and frame relationships, significantly improving accuracy.
Contribution
It proposes a multi-stage semantic fusion network and an inter-frame attention module to enhance feature richness and reduce redundancy in video ReID.
Findings
Achieves state-of-the-art re-identification accuracy.
Effectively reduces information loss and redundancy.
Improves feature aggregation across semantic levels and temporal relationships.
Abstract
Recently, research interest in person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating the frame features of an entire video. However, existing video-based ReID methods do not consider the semantic differences between the outputs of different network stages, which potentially compromises the information richness of the person features. Furthermore, traditional methods ignore important relationships among frames, which causes information redundancy in fusion along the time axis. To address these issues, we propose a novel general temporal fusion framework to aggregate frame features on both the semantic aspect and the time aspect. As for the semantic aspect, a multi-stage fusion network is explored to fuse richer frame features at multiple semantic levels, which can effectively reduce the information loss caused by the…
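To make the two fusion axes concrete, here is a minimal sketch in plain Python. It is not the authors' network: the abstract does not specify how the inter-frame attention or the multi-stage fusion is computed, so the weighting rule below (down-weighting frames whose features are, on average, highly similar to the other frames) and every function name and the `temperature` parameter are assumptions chosen only for illustration. Frame features are assumed to be already extracted as per-stage lists of vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def redundancy_aware_weights(frames, temperature=1.0):
    """Hypothetical inter-frame weighting (an assumption, not the paper's module).

    A frame that is similar to many other frames is treated as redundant
    and receives a lower weight in the temporal fusion.
    """
    n = len(frames)
    if n == 1:
        return [1.0]
    scores = []
    for i in range(n):
        mean_sim = sum(
            cosine(frames[i], frames[j]) for j in range(n) if j != i
        ) / (n - 1)
        scores.append(-mean_sim / temperature)
    return softmax(scores)


def temporal_fuse(frames, temperature=1.0):
    """Time aspect: weighted average of the frame features of one stage."""
    weights = redundancy_aware_weights(frames, temperature)
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


def multi_stage_fuse(stage_features, temperature=1.0):
    """Semantic aspect: fuse each stage over time, then concatenate stages.

    stage_features is a list over network stages; each entry is a list of
    per-frame feature vectors for that stage (hypothetical input layout).
    """
    fused = []
    for frames in stage_features:
        fused.extend(temporal_fuse(frames, temperature))
    return fused


# Toy input: three frames, two stages; frames 0 and 1 are duplicates.
low_level = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
high_level = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
video_feature = multi_stage_fuse([low_level, high_level])
```

In this toy input the two duplicate frames share a lower weight than the distinct third frame, which is the kind of redundancy reduction along the time axis that the abstract motivates. Concatenating the per-stage results is only one simple way to combine semantic levels.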
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Gait Recognition and Analysis
