DINOv2 Driven Gait Representation Learning for Video-Based Visible-Infrared Person Re-identification
Yujie Yang, Shuang Li, Jun Ye, Neng Dong, Fan Li, Huafeng Li

TL;DR
This paper introduces a novel gait representation learning framework for video-based visible-infrared person re-identification, leveraging DINOv2 priors and bidirectional multi-granularity enhancement to improve cross-modal matching accuracy.
Contribution
The paper proposes a DINOv2-driven gait learning framework with semantic-aware silhouette enhancement and bidirectional multi-granularity refinement, addressing the limitations of appearance-only methods.
Findings
Outperforms state-of-the-art methods on the HITSZ-VCM and BUPT datasets.
Effectively integrates gait features with appearance cues for robust cross-modal re-identification.
Demonstrates significant accuracy improvements over existing methods.
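To make the idea of combining gait features with appearance cues for cross-modal matching concrete, here is a minimal sketch of sequence-level retrieval. It is not the paper's method: the fusion rule (temporal mean pooling, L2 normalization, concatenation), the function names, and all feature values are assumptions invented for illustration, and the numbers are not results from the paper.

```python
import math

def mean_pool(frames):
    """Average per-frame feature vectors into one sequence-level vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(appearance_frames, gait_frames):
    """Hypothetical fusion: pool each stream over time, normalize, concatenate."""
    app = l2_normalize(mean_pool(appearance_frames))
    gait = l2_normalize(mean_pool(gait_frames))
    return l2_normalize(app + gait)

def cosine(a, b):
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_gallery(query, gallery):
    """Return gallery identities sorted by descending similarity to the query."""
    scores = [(cosine(query, feat), pid) for pid, feat in gallery.items()]
    return [pid for _, pid in sorted(scores, reverse=True)]

# Toy data (invented): an infrared query of person "A" and a visible gallery.
# The appearance features shift across modalities; the gait features do not.
query = fuse(
    appearance_frames=[[0.1, 0.9], [0.2, 0.8]],
    gait_frames=[[1.0, 0.0], [0.9, 0.1]],
)
gallery = {
    "A": fuse([[0.9, 0.1], [0.8, 0.2]], [[1.0, 0.1], [0.9, 0.0]]),
    "B": fuse([[0.2, 0.9], [0.1, 0.8]], [[0.0, 1.0], [0.1, 0.9]]),
}

ranking = rank_gallery(query, gallery)
print(ranking)
```

In this toy setup the appearance stream alone would favor the wrong identity, while the fused descriptor ranks the correct one first. The sketch only illustrates why a modality-invariant cue can complement appearance; a learned model would replace the hand-written features and the fixed fusion rule.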
Abstract
Video-based Visible-Infrared person re-identification (VVI-ReID) aims to retrieve the same pedestrian across visible and infrared modalities from video sequences. Existing methods tend to exploit modality-invariant visual features but largely overlook gait features, which are not only modality-invariant but also rich in temporal dynamics, thus limiting their ability to model the spatiotemporal consistency essential for cross-modal video matching. To address these challenges, we propose a DINOv2-Driven Gait Representation Learning (DinoGRL) framework that leverages the rich visual priors of DINOv2 to learn gait features complementary to appearance cues, facilitating robust sequence-level representations for cross-modal retrieval. Specifically, we introduce a Semantic-Aware Silhouette and Gait Learning (SASGL) model, which generates and enhances silhouette representations with…
Taxonomy
Topics: Gait Recognition and Analysis · Human Pose and Action Recognition · Video Surveillance and Tracking Methods
