Prompt Tuning Vision Language Models with Margin Regularizer for Few-Shot Learning under Distribution Shifts
Debarshi Brahma, Anuska Roy, Soma Biswas

TL;DR
This paper introduces PromptMargin, a prompt-tuning method with a margin regularizer, to adapt large vision-language models for few-shot learning under distribution shifts, improving class discrimination and robustness.
Contribution
The paper proposes a novel prompt-tuning approach with a multimodal margin regularizer for effective adaptation of VLMs in few-shot, distribution-shifted scenarios, addressing overfitting and generalization issues.
Findings
PromptMargin outperforms state-of-the-art methods on 15 benchmark datasets.
The margin regularizer enhances class discrimination under distribution shifts.
Selective augmentation improves training with limited samples.
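As a rough illustration of what a margin regularizer on class embeddings can look like: this summary does not give PromptMargin's actual formulation, so the sketch below is an assumption, not the authors' method. It applies a generic hinge penalty to any pair of class embeddings whose cosine distance falls below a chosen margin. The names `cosine` and `margin_penalty` and the default `margin=0.5` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_penalty(class_embeddings, margin=0.5):
    """Mean hinge penalty over all class pairs (hypothetical formulation).

    A pair contributes max(0, margin - cosine_distance), so pairs that are
    already at least `margin` apart in cosine distance add nothing.
    """
    penalties = []
    n = len(class_embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            distance = 1.0 - cosine(class_embeddings[i], class_embeddings[j])
            penalties.append(max(0.0, margin - distance))
    return sum(penalties) / len(penalties) if penalties else 0.0


# Well-separated (orthogonal) class embeddings incur no penalty.
separated = margin_penalty([[1.0, 0.0], [0.0, 1.0]])
# Collinear class embeddings are penalized by the full margin.
collapsed = margin_penalty([[1.0, 0.0], [2.0, 0.0]])
```

In a prompt-tuning setup, a term like this would typically be added to the classification loss with a weighting coefficient, so that the learned class representations stay separated when only a few labeled samples are available.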
Abstract
Recently, vision-language foundation models such as CLIP and ALIGN, which are pre-trained on large-scale data, have shown remarkable zero-shot generalization to diverse datasets with different classes and even domains. In this work, we take a step further and analyze whether these models can be adapted to target datasets whose distributions and classes differ greatly from those the models were trained on, using only a few labeled examples from the target dataset. In such scenarios, finetuning large pretrained models is challenging because of overfitting and loss of generalization, and this setting has not been well explored in prior literature. Since the pre-training data of such models are unavailable, it is difficult to comprehend the performance on various downstream datasets. First, we try to answer the question: Given a target dataset with a few labeled examples, can…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
Methods: Contrastive Language-Image Pre-training · ALIGN
