Multiple Stochastic Prompt Tuning for Few-shot Adaptation under Extreme Domain Shift
Debarshi Brahma, Soma Biswas

TL;DR
This paper introduces MIST, a novel prompt tuning framework that adapts CLIP to extreme domain shifts with few labeled examples by using multiple learnable prompts modeled as Gaussian distributions, improving generalization in real-world scenarios.
Contribution
The paper proposes multiple stochastic prompt tuning with Gaussian modeling to handle extreme domain shifts in few-shot learning, addressing limitations of existing methods in real-world multi-class settings.
Findings
MIST outperforms state-of-the-art methods on various datasets.
Using Gaussian distributions for prompts enhances generalization.
Multiple prompts capture diverse visual modes effectively.
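To make the idea in the findings above concrete, here is a minimal sketch of multiple Gaussian-modeled prompts per class. It is not the authors' implementation. It assumes each prompt is represented directly in the shared feature space (a real prompt-tuning setup would pass learnable tokens through CLIP's text encoder), it samples with the standard reparameterization trick, and it scores a class by its best-matching prompt. The function names, tensor shapes, max aggregation, and temperature value are all hypothetical choices made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def sample_prompt_features(mean, log_var, rng):
    """Reparameterized Gaussian sample: mean + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mean.shape)
    return mean + np.exp(0.5 * log_var) * eps

def class_logits(image_feat, mean, log_var, rng, temperature=0.01):
    """Score images against K stochastic prompts per class.

    image_feat: (B, D) image features.
    mean, log_var: (C, K, D) Gaussian parameters for K prompts in each of C classes.
    Returns logits of shape (B, C).
    """
    text_feat = l2_normalize(sample_prompt_features(mean, log_var, rng))
    image_feat = l2_normalize(image_feat)
    # Cosine similarity between every image and every (class, prompt) pair.
    sims = np.einsum("bd,ckd->bck", image_feat, text_feat)
    # Assumed aggregation: keep the best-matching prompt for each class.
    return sims.max(axis=-1) / temperature

# Toy usage with random stand-ins for learned parameters and image features.
rng = np.random.default_rng(0)
num_classes, num_prompts, dim = 5, 3, 16
mean = rng.standard_normal((num_classes, num_prompts, dim))
log_var = np.full((num_classes, num_prompts, dim), -4.0)  # small initial variance
images = rng.standard_normal((2, dim))
logits = class_logits(images, mean, log_var, rng)
```

In this sketch, several prompts per class let different prompts align with different visual modes of that class, while sampling from a distribution rather than using a fixed point acts as a regularizer when only a few labeled examples are available.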
Abstract
Foundation Vision-Language Models (VLMs) like CLIP exhibit strong generalization capabilities due to large-scale pretraining on diverse image-text pairs. However, their performance often degrades when applied to target datasets with significant distribution shifts in both visual appearance and class semantics. Recent few-shot learning approaches adapt CLIP to downstream tasks using limited labeled data via adapter or prompt tuning, but are not specifically designed to handle such extreme domain shifts. Conversely, some works addressing cross-domain few-shot learning consider such domain-shifted scenarios but operate in an episodic setting with only a few classes per episode, limiting their applicability to real-world deployment, where all classes must be handled simultaneously. To address this gap, we propose a novel framework, MIST (Multiple Stochastic Prompt Tuning), for efficiently…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
