TF-CLIP: Learning Text-free CLIP for Video-based Person Re-Identification
Chenyang Yu, Xuehu Liu, Yingquan Wang, Pingping Zhang, and Huchuan Lu

TL;DR
TF-CLIP introduces a text-free, CLIP-based framework for video person re-identification that leverages identity-specific memory and temporal diffusion to improve performance without relying on text descriptions.
Contribution
The paper proposes a novel one-stage, text-free CLIP-based learning framework with memory modules and temporal diffusion for enhanced video-based person ReID.
Findings
Outperforms state-of-the-art methods on MARS, LS-VID, and iLIDS-VID datasets.
Introduces CLIP-Memory, which replaces the text feature, and a Temporal Memory Diffusion (TMD) module that captures temporal information.
Demonstrates the effectiveness of a text-free approach to video-based person ReID.
Abstract
Large-scale language-image pre-trained models (e.g., CLIP) have shown superior performances on many cross-modal retrieval tasks. However, the problem of transferring the knowledge learned from such models to video-based person re-identification (ReID) has barely been explored. In addition, there is a lack of decent text descriptions in current ReID benchmarks. To address these issues, in this work, we propose a novel one-stage text-free CLIP-based learning framework named TF-CLIP for video-based person ReID. More specifically, we extract the identity-specific sequence feature as the CLIP-Memory to replace the text feature. Meanwhile, we design a Sequence-Specific Prompt (SSP) module to update the CLIP-Memory online. To capture temporal information, we further propose a Temporal Memory Diffusion (TMD) module, which consists of two key components: Temporal Memory Construction (TMC) and…
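The idea of replacing the text feature with an identity-specific sequence feature can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the memory for each identity is the mean of that identity's temporally pooled sequence features, and that sequences are matched to the memory with a softmax cross-entropy over cosine similarities. The function names (`build_memory`, `video_to_memory_loss`), the `temperature` value, and the random stand-in features are all hypothetical, and the online update performed by the SSP module is not modeled.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def build_memory(frame_feats, labels, num_ids):
    """Build one memory vector per identity (assumed construction).

    frame_feats: (N, T, D) per-frame features for N sequences of T frames.
    labels:      (N,) integer identity label of each sequence.
    Returns a (num_ids, D) array of unit-norm identity features.
    """
    seq_feats = frame_feats.mean(axis=1)  # temporal average pooling -> (N, D)
    memory = np.zeros((num_ids, seq_feats.shape[1]))
    for pid in range(num_ids):
        mask = labels == pid
        if mask.any():
            # Assumption: identity feature = mean over that identity's sequences.
            memory[pid] = seq_feats[mask].mean(axis=0)
    return l2_normalize(memory)


def video_to_memory_loss(seq_feats, labels, memory, temperature=0.07):
    """Cross-entropy over cosine similarities between sequences and memory.

    The memory rows play the role that text features play in CLIP.
    """
    logits = l2_normalize(seq_feats) @ memory.T / temperature  # (N, num_ids)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


# Toy usage with random stand-ins for visual-encoder outputs.
rng = np.random.default_rng(0)
frame_feats = rng.normal(size=(6, 8, 16))  # 6 sequences, 8 frames, 16-dim
labels = np.array([0, 0, 1, 1, 2, 2])
memory = build_memory(frame_feats, labels, num_ids=3)
loss = video_to_memory_loss(frame_feats.mean(axis=1), labels, memory)
```

In this sketch the memory is computed once from fixed features; a framework that updates the memory online, as the abstract describes for SSP, would refresh these vectors during training.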
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gait Recognition and Analysis · Human Pose and Action Recognition
Methods: Diffusion
