Auxiliary Descriptive Knowledge for Few-Shot Adaptation of Vision-Language Model
SuBeen Lee, GilHan Park, WonJun Moon, Hyun Seok Seong, Jae-Pil Heo

TL;DR
This paper introduces Auxiliary Descriptive Knowledge (ADK), a framework that enriches text representations with descriptive prompts generated by large language models, improving few-shot adaptation of vision-language models without added inference overhead.
Contribution
The paper proposes ADK, a novel, efficient method to incorporate rich, descriptive prompts into vision-language models, enhancing few-shot adaptation performance without increasing inference costs.
Findings
ADK improves the performance of multiple PEFT methods across various tasks.
ADK achieves state-of-the-art results in few-shot vision-language adaptation.
The approach is parameter-free and plug-and-play, facilitating easy integration.
Abstract
Despite the impressive zero-shot capabilities of Vision-Language Models (VLMs), they often struggle in downstream tasks with distribution shifts from the pre-training data. Few-Shot Adaptation (FSA-VLM) has emerged as a key solution, typically using Parameter-Efficient Fine-Tuning (PEFT) to adapt models with minimal data. However, these PEFT methods are constrained by their reliance on fixed, handcrafted prompts, which are often insufficient to capture the semantics of classes. While some studies have proposed leveraging image-induced prompts to provide additional clues for classification, they introduce prohibitive computational overhead at inference. Therefore, we introduce Auxiliary Descriptive Knowledge (ADK), a novel framework that enriches text representations without compromising efficiency. ADK first leverages a Large Language Model to generate a rich set of…
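The abstract is cut off before it explains how the generated descriptions are used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each class has one handcrafted prompt embedding plus a few LLM-generated descriptive prompt embeddings, and it swaps in a plain average as the aggregation step. The function names (`build_class_features`, `classify`) and the toy 3-dimensional vectors are hypothetical; in practice the embeddings would come from a VLM text and image encoder. The point it illustrates is why such enrichment need not add inference cost: the class features are computed once offline, and inference still compares the image against a single vector per class.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (raises ZeroDivisionError on a zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def build_class_features(handcrafted, descriptive):
    """Offline step: fuse prompt embeddings into one cached vector per class.

    handcrafted: dict mapping class name -> embedding of the fixed prompt
    descriptive: dict mapping class name -> list of embeddings of
                 LLM-generated descriptions (assumed to be precomputed)
    ASSUMPTION: plain averaging stands in for whatever aggregation ADK uses.
    """
    features = {}
    for name, base in handcrafted.items():
        vectors = [l2_normalize(base)]
        vectors += [l2_normalize(d) for d in descriptive.get(name, [])]
        mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(len(base))]
        features[name] = l2_normalize(mean)
    return features


def classify(image_embedding, class_features):
    """Inference step: cosine similarity against one cached vector per class."""
    img = l2_normalize(image_embedding)
    scores = {
        name: sum(a * b for a, b in zip(img, feat))
        for name, feat in class_features.items()
    }
    return max(scores, key=scores.get), scores


# Toy, made-up embeddings purely for illustration.
handcrafted = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}
descriptive = {"cat": [[0.8, 0.0, 0.6]], "dog": [[0.0, 0.8, 0.6]]}

features = build_class_features(handcrafted, descriptive)  # done once, offline
label, scores = classify([0.7, 0.1, 0.7], features)        # per-image cost unchanged
print(label)
```

Because the descriptive prompts are folded into `features` before any image is seen, the per-image work is the same as with handcrafted prompts alone: one dot product per class.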
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis
