Differentiable Prompt Learning for Vision Language Models
Zhenhan Huang, Tejaswini Pedapati, Pin-Yu Chen, Jianxi Gao

TL;DR
This paper introduces Differentiable Prompt Learning (DPL), an automated method to optimize prompt configurations in vision-language models, significantly improving downstream task performance over manual prompt strategies.
Contribution
The paper proposes DPL, a novel optimization-based approach for automatic prompt configuration, enhancing deep continuous prompt design in pre-trained models.
Findings
DPL achieves a 2.60% average accuracy boost on 11 datasets.
DPL effectively finds high-confidence prompt configurations with limited data.
The method is compatible with existing sophisticated prompt designs.
Abstract
Prompt learning is an effective way to exploit the potential of large-scale pre-trained foundational models. Continuous prompts parameterize context tokens in prompts by turning them into differentiable vectors. Deep continuous prompts insert prompts not only in the input but also in the intermediate hidden representations. Manually designed deep continuous prompts exhibit a remarkable improvement compared to the zero-shot pre-trained model on downstream tasks. How to automate continuous prompt design remains underexplored, and a fundamental question arises: is the manually designed deep prompt strategy optimal? To answer this question, we propose a method dubbed differentiable prompt learning (DPL). The DPL method is formulated as an optimization problem to automatically determine the optimal context length of the prompt to be added to each layer, where the objective is to maximize…
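The idea of deep continuous prompts with a separate context length per layer can be illustrated with a small sketch. This is not the authors' implementation: `toy_layer` is a hypothetical stand-in for a transformer block, the prompt vectors are random rather than trained, and the per-layer lengths in `context_lengths` are fixed by hand here, whereas DPL searches for them. A length of 0 means no prompt is inserted at that layer.

```python
import random


def make_prompt(length, dim, rng):
    """Return `length` context vectors of size `dim`.

    In a real model these would be learnable parameters; here they are
    random placeholders (an assumption made for illustration only).
    """
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(length)]


def toy_layer(tokens):
    """Hypothetical stand-in for a transformer block.

    It scales every token independently, so sequence length and
    embedding size are preserved.
    """
    return [[0.5 * x for x in tok] for tok in tokens]


def forward_with_deep_prompts(inputs, context_lengths, dim, seed=0):
    """Run the toy layers, prepending a fresh prompt at each layer.

    context_lengths[i] is the number of prompt vectors inserted before
    layer i. The prompt outputs are discarded after each layer, so the
    next layer receives its own prompt plus the original token positions.
    """
    rng = random.Random(seed)
    hidden = inputs
    seq_lens = []
    for length in context_lengths:
        prompts = make_prompt(length, dim, rng)
        layer_in = prompts + hidden      # prompts go in front of the tokens
        seq_lens.append(len(layer_in))
        layer_out = toy_layer(layer_in)
        hidden = layer_out[length:]      # drop this layer's prompt outputs
    return hidden, seq_lens


# Four input tokens, three layers with context lengths 2, 0, and 3.
inputs = [[1.0, 2.0, 3.0] for _ in range(4)]
hidden, seq_lens = forward_with_deep_prompts(inputs, [2, 0, 3], dim=3)
print(seq_lens)     # [6, 4, 7]
print(len(hidden))  # 4
```

A manually designed strategy corresponds to choosing `context_lengths` by hand, for example the same length at every layer up to some depth. The question the abstract raises is whether such a hand-picked list is optimal.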
Taxonomy
Topics: Multimodal Machine Learning Applications
Methods: Contrastive Language-Image Pre-training
