InterPartAbility: Text-Guided Part Matching for Interpretable Person Re-Identification

Shakeeb Murtaza; Aryan Shukla; Rajarshi Bhattacharya; Maguelonne Heritier; Eric Granger

arXiv:2604.27122 · cs.CV · May 1, 2026

TL;DR

InterPartAbility is an interpretable text-to-image person re-identification method that explicitly matches image parts to text phrases, yielding grounded explanations while keeping retrieval accuracy competitive.

Contribution

The paper proposes a patch-phrase interaction module (PPIM) and spatial attention constraints to improve interpretability in text-to-image person re-identification.

Findings

01. Achieves state-of-the-art interpretability on CUHK-PEDES and ICFG-PEDES datasets.

02. Maintains competitive retrieval accuracy while providing grounded explanations.

03. Introduces a new quantitative interpretability evaluation protocol.

Abstract

Text-to-image person re-identification (TI-ReID) relies on a natural-language description to retrieve the top-matching individuals from a large gallery of images. While recent large vision-language models (VLMs) achieve strong retrieval performance, their decisions remain largely uninterpretable. Existing interpretability approaches in TI-ReID rely solely on slot attention to highlight attended regions, but fail to reliably bind visual regions to semantically meaningful concepts, limiting explanations to qualitative visualizations over a restricted vocabulary. This paper introduces InterPartAbility, an interpretable TI-ReID method that performs explicit part-wise matching and enables phrase-region grounding. A new open-vocabulary patch-phrase interaction module (PPIM) with lightweight supervision is proposed to train a standard TI-ReID model with concept-level guidance. Concept-based part…
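
The abstract mentions part-wise matching and phrase-region grounding, but this listing does not give the details. As a rough illustration only, and not the authors' PPIM, the sketch below grounds each text phrase to its most similar image patch by cosine similarity and averages those similarities into a matching score. The function names (`ground_phrases`, `match_score`), the toy 3-dimensional embeddings, and the max-then-mean aggregation are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ground_phrases(patch_embs, phrase_embs):
    """For each phrase, return (index of best-matching patch, similarity).

    Hypothetical helper: a generic max-similarity grounding, not the paper's module.
    """
    grounding = []
    for phrase in phrase_embs:
        sims = [cosine(phrase, patch) for patch in patch_embs]
        best = max(range(len(sims)), key=sims.__getitem__)
        grounding.append((best, sims[best]))
    return grounding


def match_score(patch_embs, phrase_embs):
    """Mean of the per-phrase best similarities (assumed aggregation rule)."""
    grounding = ground_phrases(patch_embs, phrase_embs)
    if not grounding:
        return 0.0
    return sum(sim for _, sim in grounding) / len(grounding)


# Toy data: three image patches and two phrases in a made-up 3-d embedding space.
patches = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
phrases = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]

grounding = ground_phrases(patches, phrases)
score = match_score(patches, phrases)
for phrase_idx, (patch_idx, sim) in enumerate(grounding):
    print(f"phrase {phrase_idx} -> patch {patch_idx} (similarity {sim:.3f})")
print(f"image-text matching score: {score:.3f}")
```

In this toy setup the per-phrase patch index is the explanation: it says which region supported which phrase, and the retrieval score is built from exactly those pairings.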
