Cascade Prompt Learning for Vision-Language Model Adaptation
Ge Wu, Xin Zhang, Zheng Li, Zhaowei Chen, Jiajun Liang, Jian Yang, and Xiang Li

TL;DR
CasPL introduces a two-phase cascade prompt learning framework for vision-language models, improving adaptation to downstream tasks by capturing both domain-general and task-specific knowledge, reducing overfitting, and enhancing performance.
Contribution
The paper proposes a novel cascade prompt learning paradigm with two distinct prompt phases, enabling simultaneous extraction of domain-general and task-specific knowledge for better model adaptation.
Findings
CasPL outperforms previous methods like PromptSRC on multiple datasets.
It achieves a 1.85% to 3.44% improvement in classification accuracy.
CasPL maintains a good balance between performance and inference speed.
Abstract
Prompt learning has surfaced as an effective approach to enhance the performance of Vision-Language Models (VLMs) like CLIP when applied to downstream tasks. However, current learnable prompt tokens are primarily used for the single phase of adapting to tasks (i.e., adapting prompt), easily leading to overfitting risks. In this work, we propose a novel Cascade Prompt Learning (CasPL) framework to enable prompt learning to serve both generic and specific expertise (i.e., boosting and adapting prompt) simultaneously. Specifically, CasPL is a new learning paradigm comprising two distinct phases of learnable prompts: the first boosting prompt is crafted to extract domain-general knowledge from a senior larger CLIP teacher model by aligning their predicted logits using extensive unlabeled domain images. The second adapting prompt is then cascaded with the frozen first set to fine-tune the…
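The two phases described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract only says that the boosting prompt is trained by aligning student and teacher logits, so the KL-divergence loss and the temperature parameter below are assumptions. The names `logit_alignment_loss` and `cascade_prompts` are hypothetical, and so is the choice to place the frozen boosting tokens before the adapting tokens.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of class logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_sum for x in scaled]


def logit_alignment_loss(student_logits, teacher_logits, temperature=1.0):
    """Phase 1 (assumed loss): KL(teacher || student) on one unlabeled image.

    No ground-truth label is needed; the teacher's predicted distribution
    is the only training signal for the boosting prompt.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher distribution
    log_q = log_softmax(student_logits, temperature)  # student distribution
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def cascade_prompts(boosting_tokens, adapting_tokens):
    """Phase 2: combine the frozen boosting prompt with a new adapting prompt.

    Returns (token, trainable) pairs. Boosting tokens are marked frozen,
    adapting tokens are marked learnable. The ordering is an assumption.
    """
    frozen = [(token, False) for token in boosting_tokens]
    learnable = [(token, True) for token in adapting_tokens]
    return frozen + learnable


if __name__ == "__main__":
    # Toy logits for a 3-class problem (made-up numbers, for illustration only).
    teacher = [1.0, 2.0, 3.0]
    student = [3.0, 2.0, 1.0]
    print("alignment loss:", logit_alignment_loss(student, teacher))

    # Placeholder token names stand in for learnable embedding vectors.
    cascade = cascade_prompts(["boost_0", "boost_1"], ["adapt_0"])
    print("trainable tokens:", [token for token, trainable in cascade if trainable])
```

In this sketch the first phase needs only teacher predictions, which matches the abstract's use of unlabeled domain images. The second phase updates only the tokens flagged as trainable, so the boosting prompt stays fixed during task adaptation.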
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Sparse Evolutionary Training · Balanced Selection · Contrastive Language-Image Pre-training
