X-ReID: Multi-granularity Information Interaction for Video-Based Visible-Infrared Person Re-Identification

Chenyang Yu; Xuehu Liu; Pingping Zhang; Huchuan Lu

arXiv:2511.17964·cs.CV·November 26, 2025

TL;DR

This paper introduces X-ReID, a framework for video-based visible-infrared person re-identification that reduces the modality gap and exploits spatiotemporal information, outperforming state-of-the-art methods on the HITSZ-VCM and BUPTCampus benchmarks.

Contribution

The paper proposes X-ReID, a framework that combines Cross-modality Prototype Collaboration (CPC) with Multi-granularity Information Interaction (MII) to improve VVI-ReID performance.

Findings

01. Outperforms state-of-the-art methods on the HITSZ-VCM and BUPTCampus benchmarks.

02. Reduces modality discrepancy and enhances temporal modeling.

03. Produces robust sequence-level representations for VVI-ReID.

Abstract

Large-scale vision-language models (e.g., CLIP) have recently achieved remarkable performance in retrieval tasks, yet their potential for Video-based Visible-Infrared Person Re-Identification (VVI-ReID) remains largely unexplored. The primary challenges are narrowing the modality gap and leveraging spatiotemporal information in video sequences. To address these issues, we propose X-ReID, a novel cross-modality feature learning framework for VVI-ReID. Specifically, we first propose Cross-modality Prototype Collaboration (CPC) to align and integrate features from different modalities, guiding the network to reduce the modality discrepancy. We then design Multi-granularity Information Interaction (MII), which incorporates short-term interactions between adjacent frames, long-term cross-frame information fusion, and cross-modality feature alignment to enhance temporal…
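The abstract describes prototype-based alignment only at a high level, so the following is an illustrative sketch of the general idea, not the authors' CPC module. It assumes that a "prototype" is the mean sequence-level feature of one identity within one modality, and that the modality gap can be summarized as the average distance between same-identity prototypes. The function names `class_prototypes` and `prototype_gap` and the toy feature values are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def class_prototypes(features, labels):
    """Average the feature vectors of each identity into one prototype.

    Assumption: a prototype is a per-identity mean; the paper may define it differently.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, pid in zip(features, labels):
        if pid not in sums:
            sums[pid] = [0.0] * len(vec)
        sums[pid] = [s + v for s, v in zip(sums[pid], vec)]
        counts[pid] += 1
    return {pid: [s / counts[pid] for s in total] for pid, total in sums.items()}


def prototype_gap(visible_protos, infrared_protos):
    """Mean Euclidean distance between same-identity prototypes of two modalities."""
    shared = sorted(set(visible_protos) & set(infrared_protos))
    if not shared:
        raise ValueError("no identity appears in both modalities")
    total = 0.0
    for pid in shared:
        pairs = zip(visible_protos[pid], infrared_protos[pid])
        total += sqrt(sum((a - b) ** 2 for a, b in pairs))
    return total / len(shared)


# Toy sequence-level features: two identities, two tracklets each per modality.
vis_feats = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
ir_feats = [[2.0, 4.0], [2.0, 2.0], [4.0, 3.0], [4.0, 3.0]]
ids = [0, 0, 1, 1]

vis_protos = class_prototypes(vis_feats, ids)
ir_protos = class_prototypes(ir_feats, ids)
gap = prototype_gap(vis_protos, ir_protos)
print(f"mean prototype gap: {gap:.2f}")
```

In a training setting, a quantity like `gap` could serve as an auxiliary loss term that pulls the visible and infrared prototypes of the same person together; whether X-ReID uses a distance of this form is not stated in the abstract.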

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition