Prototypical Contrastive Learning-based CLIP Fine-tuning for Object Re-identification

Jiachen Li; Xiaojin Gong

arXiv:2310.17218 · cs.CV · January 15, 2026 · 5 cites

Open Access · 1 Repo

TL;DR

This paper introduces a simple prototypical contrastive learning method to fine-tune CLIP for object re-identification, outperforming prompt learning approaches in supervised and unsupervised settings.

Contribution

It proposes a novel PCL-based fine-tuning approach for CLIP that removes prompt learning and improves Re-ID performance in various supervision scenarios.

Findings

01

Outperforms CLIP-ReID in supervised Re-ID tasks

02

Achieves state-of-the-art results in unsupervised Re-ID

03

Eliminates the need for prompt learning in CLIP fine-tuning

Abstract

This work aims to adapt large-scale pre-trained vision-language models, such as contrastive language-image pretraining (CLIP), to enhance the performance of object re-identification (Re-ID) across various supervision settings. Although prompt learning has enabled a recent work named CLIP-ReID to achieve promising performance, the underlying mechanisms and the necessity of prompt learning remain unclear due to the absence of semantic labels in Re-ID tasks. In this work, we first analyze the role of prompt learning in CLIP-ReID and identify its limitations. Based on our investigations, we propose a simple yet effective approach to adapt CLIP for supervised object Re-ID. Our approach directly fine-tunes the image encoder of CLIP using a prototypical contrastive learning (PCL) loss, eliminating the need for prompt learning. Experimental results on both person and vehicle Re-ID datasets…
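
To make the idea of a prototypical contrastive loss more concrete, here is a minimal, framework-free sketch. It is not the authors' implementation (the official repository is PyTorch-based). It assumes that each identity's prototype is the normalized mean of that identity's image features and that similarities are scaled by a temperature of 0.05. Both choices, and the function names `l2_normalize`, `build_prototypes`, and `pcl_loss`, are illustrative assumptions rather than details taken from the abstract.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def build_prototypes(features, labels):
    """Assumption: one prototype per identity, the normalized mean feature."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        feat = l2_normalize(feat)
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], feat)]
        counts[label] += 1
    return {
        label: l2_normalize([s / counts[label] for s in sums[label]])
        for label in sums
    }


def pcl_loss(features, labels, prototypes, temperature=0.05):
    """Mean cross-entropy of each feature against all identity prototypes.

    The logit for identity k is cosine(feature, prototype_k) / temperature;
    the target is the feature's own identity. temperature=0.05 is an
    assumed value, not one stated in the source.
    """
    total = 0.0
    for feat, label in zip(features, labels):
        feat = l2_normalize(feat)
        logits = {
            k: sum(a * b for a, b in zip(feat, proto)) / temperature
            for k, proto in prototypes.items()
        }
        # Numerically stable log-sum-exp over all prototypes.
        peak = max(logits.values())
        log_denominator = peak + math.log(
            sum(math.exp(v - peak) for v in logits.values())
        )
        total += log_denominator - logits[label]
    return total / len(features)


# Toy usage: two identities with two 2-D "image features" each.
features = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
prototypes = build_prototypes(features, labels)
loss = pcl_loss(features, labels, prototypes)
```

In an actual fine-tuning loop the features would come from the CLIP image encoder and the loss would be backpropagated through it. In this sketch the loss is small when each feature lies closest to its own identity's prototype and grows when it is closer to another identity's prototype.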

Code & Models

Repositories

RikoLi/PCL-CLIP
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Contrastive Learning · Contrastive Language-Image Pre-training