Explicit Uncertainty Modeling for Active CLIP Adaptation with Dual Prompt Tuning

Qian-Wei Wang; Yaguang Song; Shu-Tao Xia

arXiv:2602.04340·cs.CV·February 5, 2026

Open Access

TL;DR

This paper introduces a novel uncertainty modeling approach for active CLIP adaptation using dual prompt tuning, which improves sample selection and classification performance in limited annotation scenarios.

Contribution

The paper proposes a dual-prompt tuning framework that explicitly models uncertainty from the model perspective, with the aim of enhancing active learning for CLIP-based image classification.

Findings

01 Outperforms existing active learning methods under the same annotation budget

02 Improves classification reliability through positive prompt tuning

03 Provides a principled uncertainty signal for sample selection

Abstract

Pre-trained vision-language models such as CLIP exhibit strong transferability, yet adapting them to downstream image classification tasks under limited annotation budgets remains challenging. In active learning settings, the model must select the most informative samples for annotation from a large pool of unlabeled data. Existing approaches typically estimate uncertainty via entropy-based criteria or representation clustering, without explicitly modeling uncertainty from the model perspective. In this work, we propose a robust uncertainty modeling framework for active CLIP adaptation based on dual-prompt tuning. We introduce two learnable prompts in the textual branch of CLIP. The positive prompt enhances the discriminability of task-specific textual embeddings corresponding to light-weight tuned visual embeddings, improving classification reliability. Meanwhile, the negative prompt…
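For context, the entropy-based criterion that the abstract attributes to existing approaches can be sketched as follows. This is a minimal illustration of that baseline only, not the paper's dual-prompt method. The function names (`softmax`, `entropy`, `select_for_annotation`) and the example logits are assumptions made up for the sketch; in practice the logits would come from CLIP image-text similarity scores over the unlabeled pool.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert a list of class logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(pool_logits, budget):
    """Return indices of the `budget` samples with the highest predictive entropy."""
    scores = [entropy(softmax(logits)) for logits in pool_logits]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical logits for three unlabeled images over three classes.
pool_logits = [
    [9.0, 0.5, 0.2],  # confident prediction, low entropy
    [2.0, 1.9, 2.1],  # near-uniform prediction, high entropy
    [4.0, 3.5, 0.1],  # ambiguous between two classes
]

selected = select_for_annotation(pool_logits, budget=2)
print(selected)
```

With these made-up logits, the near-uniform and two-way-ambiguous samples are queried first, while the confidently classified sample is left unlabeled. The abstract's criticism is that a score of this kind is derived only from the output distribution rather than from an explicit model of uncertainty.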

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques