Embedding and Enriching Explicit Semantics for Visible-Infrared Person Re-Identification

Neng Dong; Shuanglin Yan; Liyan Zhang; Jinhui Tang

arXiv:2412.08406 · cs.CV · December 12, 2024

Open Access

TL;DR

This paper introduces a novel framework for visible-infrared person re-identification that leverages explicit semantic embeddings, cross-view semantics compensation, and semantics purification to improve cross-modality matching accuracy.

Contribution

The paper presents the Embedding and Enriching Explicit Semantics (EEES) framework, which combines explicit semantics embedding built on large language-vision models, cross-view semantics compensation, and semantics purification to improve VIReID performance.

Findings

01. EEES outperforms existing methods on benchmark datasets.

02. Explicit semantics embedding improves cross-modality alignment.

03. Semantics purification reduces noise from conflicting attributes.

Abstract

Visible-infrared person re-identification (VIReID) retrieves pedestrian images with the same identity across different modalities. Existing methods learn visual content solely from images, lacking the capability to sense high-level semantics. In this paper, we propose an Embedding and Enriching Explicit Semantics (EEES) framework to learn semantically rich cross-modality pedestrian representations. Our method offers several contributions. First, with the collaboration of multiple large language-vision models, we develop Explicit Semantics Embedding (ESE), which automatically supplements language descriptions for pedestrians and aligns image-text pairs into a common space, thereby learning visual content associated with explicit semantics. Second, recognizing the complementarity of multi-view information, we present Cross-View Semantics Compensation (CVSC), which constructs multi-view…
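The abstract says ESE "aligns image-text pairs into a common space" but the excerpt does not state which objective is used. As a rough illustration only, the sketch below assumes a CLIP-style symmetric contrastive loss over a batch of matched image and text embeddings. The function names, the temperature value, and the random toy embeddings are all hypothetical and are not taken from the paper.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Assumption: row i of image_emb and row i of text_emb describe the same pedestrian.
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature  # (batch, batch) similarity matrix
    idx = np.arange(len(img))
    loss_i2t = -log_softmax(logits)[idx, idx].mean()    # image -> text direction
    loss_t2i = -log_softmax(logits.T)[idx, idx].mean()  # text -> image direction
    return 0.5 * (loss_i2t + loss_t2i)

# Toy data: random "text" embeddings and two candidate sets of "image" embeddings.
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
aligned = symmetric_contrastive_loss(text + 0.01 * rng.normal(size=(4, 8)), text)
shuffled = symmetric_contrastive_loss(text[::-1].copy(), text)
print(f"aligned pairs: {aligned:.4f}, mismatched pairs: {shuffled:.4f}")
```

In this toy setup the loss is lower when each image embedding sits close to its own description than when the pairing is reversed, which is the behaviour a common image-text space is meant to encourage.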

Taxonomy

Topics: Video Surveillance and Tracking Methods · Face recognition and analysis · Gait Recognition and Analysis