Prototype-guided Cross-modal Completion and Alignment for Incomplete Text-based Person Re-identification

Tiantian Gong; Guodong Du; Junsheng Wang; Yongkang Ding; Liyan Zhang

arXiv:2309.17104·cs.CV·October 4, 2023

TL;DR

This paper introduces a Prototype-guided Cross-modal Completion and Alignment framework to improve incomplete text-based person re-identification by effectively handling missing modality data and enhancing fine-grained cross-modal alignment.

Contribution

The proposed PCCA framework combines cross-modal nearest neighbor construction, relation graphs, and prototype-aware alignment to handle incomplete data and improve re-identification accuracy.

Findings

01 Outperforms state-of-the-art methods on multiple benchmarks.

02 Effectively handles various missing data ratios.

03 Enhances fine-grained cross-modal alignment.

Abstract

Traditional text-based person re-identification (ReID) techniques rely heavily on fully matched multi-modal data, which is an idealized scenario. In real-world applications, however, data is inevitably lost or corrupted during the collection and processing of cross-modal data, so incomplete data is common. We therefore consider a more practical task, termed incomplete text-based ReID, in which person images and text descriptions are not completely matched and some modality data is missing. To this end, we propose a novel Prototype-guided Cross-modal Completion and Alignment (PCCA) framework for incomplete text-based ReID. Specifically, person images cannot be retrieved directly from a text query when modality data is missing. Therefore, we propose the cross-modal nearest neighbor construction strategy for missing data by…
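The abstract is cut off before it describes how the cross-modal nearest neighbor construction works, so the following is only a generic sketch of the nearest-neighbor completion idea, not the PCCA procedure. It assumes each sample has a text embedding, that some samples lack an image embedding, and that a missing image feature can be approximated by averaging the image features of the samples whose text is most similar. The function name `complete_missing_image_features`, the parameter `k`, and the use of cosine similarity with mean pooling are all illustrative assumptions.

```python
import numpy as np


def complete_missing_image_features(text_feats, image_feats, has_image, k=2):
    """Hypothetical helper: fill missing image features from similar samples.

    text_feats:  (n, d_t) array, one text embedding per sample.
    image_feats: (n, d_i) array; rows where has_image is False are ignored.
    has_image:   (n,) boolean mask marking samples whose image is present.
    k:           number of nearest "donor" samples to average (assumption).
    """
    text_feats = np.asarray(text_feats, dtype=float)
    completed = np.array(image_feats, dtype=float, copy=True)
    has_image = np.asarray(has_image, dtype=bool)

    # L2-normalise text embeddings so dot products are cosine similarities.
    norms = np.linalg.norm(text_feats, axis=1, keepdims=True)
    unit = text_feats / np.clip(norms, 1e-12, None)

    donors = np.flatnonzero(has_image)     # samples with both modalities
    missing = np.flatnonzero(~has_image)   # samples lacking an image
    if donors.size == 0 or missing.size == 0:
        return completed
    k = min(k, donors.size)

    # Similarity of every incomplete sample's text to every donor's text.
    sims = unit[missing] @ unit[donors].T            # (n_missing, n_donors)
    top = np.argsort(-sims, axis=1)[:, :k]           # k most similar donors

    # Average the donors' image features: (n_missing, k, d_i) -> (n_missing, d_i).
    completed[missing] = completed[donors][top].mean(axis=1)
    return completed


if __name__ == "__main__":
    text = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.05]]
    image = [[1.0, 1.0], [3.0, 3.0], [10.0, 10.0], [0.0, 0.0]]
    mask = [True, True, True, False]
    filled = complete_missing_image_features(text, image, mask, k=2)
    # Row 3 becomes the mean image feature of its two closest text neighbours.
    print(filled[3])
```

In this toy input, the fourth sample has no image; its text is closest to the first two samples, so its image feature is filled with the average of theirs. Samples that already have an image are left unchanged.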
