# Generalizable Object Re-Identification via Visual In-Context Prompting

**Authors:** Zhizhong Huang, Xiaoming Liu

arXiv: 2508.21222 · 2025-09-01

## TL;DR

VICP (Visual In-Context Prompting) is an object re-identification framework that pairs LLMs with a vision foundation model so that a model trained on seen categories generalizes to unseen categories from a few in-context example pairs, with no parameter adaptation or retraining.

## Contribution

The paper presents VICP, in which LLMs infer semantic identity rules from few-shot positive/negative pairs and those rules guide a vision foundation model (e.g., DINO) through dynamic visual prompts, enabling generalizable object ReID without dataset-specific retraining.

## Key findings

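- VICP outperforms baselines by a clear margin on unseen categories, on ShopID10K and diverse ReID benchmarks.
- The paper introduces ShopID10K, a dataset of 10K object instances from e-commerce platforms with multi-view images and cross-domain testing.
- Generalization to novel categories needs only in-context examples as prompts, with no parameter adaptation.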

## Abstract

Current object re-identification (ReID) methods train domain-specific models (e.g., for persons or vehicles), which lack generalization and demand costly labeled data for new categories. While self-supervised learning reduces annotation needs by learning instance-wise invariance, it struggles to capture _identity-sensitive_ features critical for ReID. This paper proposes Visual In-Context Prompting (VICP), a novel framework where models trained on seen categories can directly generalize to unseen novel categories using only _in-context examples_ as prompts, without requiring parameter adaptation. VICP synergizes LLMs and vision foundation models (VFM): LLMs infer semantic identity rules from few-shot positive/negative pairs through task-specific prompting, which then guides a VFM (e.g., DINO) to extract ID-discriminative features via _dynamic visual prompts_. By aligning LLM-derived semantic concepts with the VFM's pre-trained prior, VICP enables generalization to novel categories, eliminating the need for dataset-specific retraining. To support evaluation, we introduce ShopID10K, a dataset of 10K object instances from e-commerce platforms, featuring multi-view images and cross-domain testing. Experiments on ShopID10K and diverse ReID benchmarks demonstrate that VICP outperforms baselines by a clear margin on unseen categories. Code is available at https://github.com/Hzzone/VICP.
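
To make the idea of prompting without parameter adaptation concrete, here is a toy sketch in plain Python. It is not the VICP architecture: there is no LLM and no VFM here, and the per-dimension weighting is an assumption introduced purely for illustration. The names `prompt_weights`, `score`, and `rank_gallery`, along with the three-dimensional "features", are hypothetical. The sketch only shows the general pattern the abstract describes: a few positive/negative pairs indicate which cues are identity-sensitive, and that information conditions how fixed features are compared at retrieval time.

```python
def prompt_weights(pos_pairs, neg_pairs, eps=1e-6):
    """Derive per-dimension weights from in-context pairs.

    Hypothetical stand-in for a prompt: a dimension gets a high weight when
    it differs across negative pairs (different identities) but stays stable
    across positive pairs (same identity). Nothing is trained.
    """
    dim = len(pos_pairs[0][0])
    weights = []
    for d in range(dim):
        pos_gap = sum(abs(a[d] - b[d]) for a, b in pos_pairs) / len(pos_pairs)
        neg_gap = sum(abs(a[d] - b[d]) for a, b in neg_pairs) / len(neg_pairs)
        weights.append(neg_gap / (pos_gap + eps))
    total = sum(weights)
    return [w / total for w in weights]


def score(u, v, weights):
    """Negative weighted squared distance: higher means more similar."""
    return -sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v))


def rank_gallery(query, gallery, weights):
    """Return gallery indices sorted from best to worst match."""
    scores = [score(query, g, weights) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Toy frozen features: [viewpoint cue, identity cue, constant background].
pos_pairs = [([1.0, 0.2, 0.5], [0.0, 0.2, 0.5])]  # same identity, new view
neg_pairs = [([1.0, 0.2, 0.5], [1.0, 0.9, 0.5])]  # different identity, same view

query = [0.0, 0.2, 0.5]
gallery = [
    [0.0, 0.9, 0.5],   # index 0: different identity, same view as the query
    [1.0, 0.25, 0.5],  # index 1: same identity, different view
]

uniform = [1.0 / 3] * 3
prompted = prompt_weights(pos_pairs, neg_pairs)

print("uniform ranking: ", rank_gallery(query, gallery, uniform))
print("prompted ranking:", rank_gallery(query, gallery, prompted))
```

With uniform weights the viewpoint cue dominates, so the wrong identity ranks first. With weights derived from the two in-context pairs, the identity cue dominates and the correct match moves to the top, even though the features themselves never change.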

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21222/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21222/full.md

## References

78 references — full list in the complete paper: https://tomesphere.com/paper/2508.21222/full.md

---
Source: https://tomesphere.com/paper/2508.21222