Enhancing Visible-Infrared Person Re-identification with Modality- and Instance-aware Visual Prompt Learning

Ruiqi Wu; Bingliang Jiao; Wenxuan Wang; Meng Liu; Peng Wang

arXiv:2406.12316·cs.CV·June 19, 2024

TL;DR

This paper introduces a transformer-based Modality-aware and Instance-aware Visual Prompt (MIP) network for visible-infrared person re-identification, effectively utilizing both invariant and modality-specific features to improve matching accuracy across modalities.

Contribution

The paper proposes a novel MIP model that uses modality-specific and instance-specific prompts to reduce the modality gap and enhance discriminative features in VI ReID tasks.

Findings

1. MIP outperforms most state-of-the-art methods on the SYSU-MM01 and RegDB datasets.

2. The designed prompts effectively reduce modality interference and improve identification accuracy.

3. Extensive experiments validate the effectiveness of the modality- and instance-aware prompts.
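
To make "matching accuracy" concrete, here is a minimal sketch of cross-modality retrieval scored by rank-1 accuracy, a metric commonly used in person re-identification. This page does not say which metrics the paper reports, so the metric choice, the cosine similarity, the function names, and the toy feature vectors are all assumptions for illustration, not the authors' evaluation protocol.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank1_accuracy(queries, gallery):
    """Fraction of queries whose most similar gallery entry shares their identity.

    queries, gallery: lists of (identity, feature_vector) pairs.
    """
    hits = 0
    for query_id, query_feat in queries:
        best_id, _ = max(gallery, key=lambda item: cosine(query_feat, item[1]))
        if best_id == query_id:
            hits += 1
    return hits / len(queries)


# Hypothetical toy features: infrared images as queries, visible images as gallery.
infrared_queries = [("p1", [1.0, 0.1]), ("p2", [0.1, 1.0]), ("p3", [1.0, 0.2])]
visible_gallery = [("p1", [0.9, 0.0]), ("p2", [0.0, 0.8]), ("p3", [0.7, 0.7])]

accuracy = rank1_accuracy(infrared_queries, visible_gallery)
```

In this toy data the third query sits closer to the wrong identity, so two of the three queries are matched correctly. A smaller modality gap in the learned features would show up as fewer such mismatches.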

Abstract

Visible-Infrared Person Re-identification (VI ReID) aims to match visible and infrared images of the same pedestrians across non-overlapping camera views. The two input modalities contain both invariant information, such as shape, and modality-specific details, such as color. An ideal model should exploit valuable information from both modalities during training to strengthen its representational capability. However, the gap caused by modality-specific information makes it substantially harder for a VI ReID model to handle inputs from distinct modalities simultaneously. To address this, we introduce the Modality-aware and Instance-aware Visual Prompts (MIP) network, designed to effectively utilize both invariant and specific information for identification. Specifically, our MIP model is built on the transformer architecture. In this model, we have designed a series of…
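
The abstract is cut off before it describes the prompt design, so the following is only a rough sketch of the general visual-prompt idea: extra tokens are prepended to a transformer's patch-token sequence. It assumes one set of prompt vectors per modality plus one prompt derived from the image itself. The names `MODALITY_PROMPTS`, `instance_prompt`, and `build_input_sequence`, the sizes, and the mean-pooling stand-in are all hypothetical and are not taken from the paper.

```python
import random

EMBED_DIM = 8             # hypothetical token width
NUM_MODALITY_PROMPTS = 2  # hypothetical number of prompts per modality


def make_prompts(count, dim, rng):
    """Stand-in for learnable prompt vectors (randomly initialised here)."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(count)]


_rng = random.Random(0)
# One prompt set per modality; the set is selected by the input's modality label.
MODALITY_PROMPTS = {
    "visible": make_prompts(NUM_MODALITY_PROMPTS, EMBED_DIM, _rng),
    "infrared": make_prompts(NUM_MODALITY_PROMPTS, EMBED_DIM, _rng),
}


def instance_prompt(patch_tokens):
    """Assumed instance-aware prompt: the mean of the image's own patch tokens."""
    count = len(patch_tokens)
    dim = len(patch_tokens[0])
    return [sum(tok[d] for tok in patch_tokens) / count for d in range(dim)]


def build_input_sequence(patch_tokens, modality):
    """Prepend modality prompts and one instance prompt to the patch tokens."""
    prompts = MODALITY_PROMPTS[modality]
    return prompts + [instance_prompt(patch_tokens)] + patch_tokens


# Usage: a toy "image" with two patch tokens, tagged as infrared.
patches = [[1.0] * EMBED_DIM, [3.0] * EMBED_DIM]
sequence = build_input_sequence(patches, "infrared")
```

The resulting sequence would then be fed to the transformer layers in place of the plain patch tokens. In a trained model the prompt vectors would be learned parameters, and the instance prompt would more plausibly come from a small learned module than from simple averaging.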
