Learning Cross-modality Information Bottleneck Representation for Heterogeneous Person Re-Identification
Haichao Shi, Mandi Luo, Xiao-Yu Zhang, Ran He

TL;DR
This paper introduces CMInfoNet, a novel network that leverages a mutual information bottleneck and a modality consensus module to improve cross-modality person re-identification by reducing redundancy and enhancing modality-specific features.
Contribution
The paper proposes a mutual information-based bottleneck and an automatic search strategy for key parts, along with a modality consensus module, to better extract identity features in VI-ReID.
Findings
Achieves state-of-the-art results on multiple VI-ReID benchmarks.
Effectively reduces modality discrepancy and information redundancy.
Improves key-part discrimination and modality alignment.
Abstract
Visible-Infrared person re-identification (VI-ReID) is an important and challenging task in intelligent video surveillance. Existing methods mainly focus on learning a shared feature space to reduce the modality discrepancy between visible and infrared modalities, which still leaves two problems underexplored: information redundancy and modality complementarity. To this end, properly eliminating the identity-irrelevant information as well as making up for the modality-specific information is critical and remains a challenging endeavor. To tackle the above problems, we present a novel mutual information and modality consensus network, namely CMInfoNet, to extract modality-invariant identity features with the most representative information and reduce the redundancies. The key insight of our method is to find an optimal representation to capture more identity-relevant information and…
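For orientation, the classical information bottleneck objective that this line of work builds on can be written as below. This is the generic textbook form, not the loss used in CMInfoNet, which the truncated abstract does not specify. The symbols are assumptions introduced here for illustration: X is an input image (visible or infrared), Y is the identity label, Z is the learned representation, and β > 0 is a trade-off weight.

```latex
% Generic information bottleneck Lagrangian (illustrative, not the paper's exact objective).
% I(X;Z): information Z keeps about the input (compression term, to be minimized).
% I(Z;Y): information Z keeps about the identity label (relevance term, to be maximized).
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y), \qquad \beta > 0
```

Under this reading, reducing redundancy corresponds to keeping I(X;Z) small, while retaining identity-relevant information corresponds to keeping I(Z;Y) large.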
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition
Methods: ALIGN · Focus
