Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification
Tengfei Liang, Yi Jin, Wu Liu, Tao Wang, Songhe Feng, Yidong Li

TL;DR
This paper introduces a novel multi-level joint alignment method for visible-infrared person re-identification, effectively bridging modality and objective gaps to improve retrieval accuracy.
Contribution
It proposes the MCJA framework with new augmentation strategies and a ranking-list-based loss, addressing both modality discrepancy and optimization misalignment in VI-ReID.
Findings
Significant performance improvement over existing methods.
Effective reduction of modality discrepancy in image space.
Establishes a strong baseline for VI-ReID tasks.
Abstract
Visible-Infrared person Re-IDentification (VI-ReID) is a challenging cross-modality image retrieval task that aims to match pedestrians' images across visible and infrared cameras. To solve the modality gap, existing mainstream methods adopt a learning paradigm converting the image retrieval task into an image classification task with cross-entropy loss and auxiliary metric learning losses. These losses follow the strategy of adjusting the distribution of extracted embeddings to reduce the intra-class distance and increase the inter-class distance. However, such objectives do not precisely correspond to the final test setting of the retrieval task, resulting in a new gap at the optimization level. By rethinking these keys of VI-ReID, we propose a simple and effective method, the Multi-level Cross-modality Joint Alignment (MCJA), bridging both modality and objective-level gap. For the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsVideo Surveillance and Tracking Methods · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
