A Retrospect to Multi-prompt Learning across Vision and Language
Ziliang Chen, Xin Huang, Quanlong Guan, Liang Lin, Weiqi Luo

TL;DR
This paper explores multi-prompt learning in vision-language models, proposing an energy-based method to generate multiple prompts that improve adaptation and generalization across tasks and domains.
Contribution
It introduces EMPL, a novel energy-based multi-prompt learning approach, and provides a theoretical and empirical analysis of multi-prompt transfer benefits.
Findings
EMPL is parameter-efficient and enhances open-vocabulary generalization.
Multi-prompt augmentation outperforms single-prompt methods.
Theoretical analysis supports the superiority of multi-prompt learning.
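To make the single-prompt versus multi-prompt distinction concrete, here is a minimal sketch of multi-prompt scoring in a CLIP-style embedding space. It is not the paper's EMPL method: it only averages cosine similarities over several prompt embeddings per class. The function name `multi_prompt_logits`, the temperature value, and the synthetic embeddings are assumptions for illustration; a real pipeline would take these embeddings from a VLM's image and text encoders.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit length."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / (norm + eps)

def multi_prompt_logits(image_emb, prompt_embs, temperature=0.01):
    """Score one image against C classes using K prompts per class.

    image_emb:   (D,)      image embedding
    prompt_embs: (K, C, D) one text embedding per (prompt, class) pair
    returns:     (C,)      cosine similarity averaged over prompts, then scaled
    """
    image_emb = l2_normalize(image_emb)
    prompt_embs = l2_normalize(prompt_embs)
    sims = prompt_embs @ image_emb          # (K, C, D) @ (D,) -> (K, C)
    return sims.mean(axis=0) / temperature  # K = 1 reduces to single-prompt scoring

# Synthetic stand-ins for encoder outputs (assumption: no real VLM is loaded).
rng = np.random.default_rng(0)
num_prompts, num_classes, dim = 5, 3, 8
prototypes = np.eye(num_classes, dim)       # one unit direction per class
prompt_embs = prototypes[None] + 0.05 * rng.standard_normal(
    (num_prompts, num_classes, dim)
)
image_emb = prototypes[1] + 0.05 * rng.standard_normal(dim)  # image near class 1

logits = multi_prompt_logits(image_emb, prompt_embs)
predicted_class = int(np.argmax(logits))
```

Passing a single prompt (`prompt_embs[:1]`) recovers the usual single-prompt scoring, so the same function can be used to compare the two settings on the same embeddings.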
Abstract
The vision community is undergoing unprecedented progress with the emergence of Vision-Language Pretraining Models (VLMs). Prompt learning serves as the holy grail of accessing VLMs, since it enables their fast adaptation to downstream tasks with limited resources. However, existing research centers on single-prompt paradigms and rarely investigates the technical potential of their multi-prompt learning counterparts. This paper aims to provide a principled retrospect for vision-language multi-prompt learning. We extend the recent constant modality gap phenomenon to learnable prompts and then justify the superiority of vision-language transfer with multi-prompt augmentation, empirically and theoretically. Based on this observation, we propose an Energy-based Multi-prompt Learning (EMPL) to generate multiple prompt embeddings by drawing instances from an energy-based…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques
