PromptKD: Unsupervised Prompt Distillation for Vision-Language Models

Zheng Li; Xiang Li; Xinyi Fu; Xin Zhang; Weiqiang Wang; Shuo Chen; Jian Yang

arXiv:2403.02781 · cs.CV · August 14, 2024 · 2 cites

Open Access · 1 Repo · 1 Models

TL;DR

This paper introduces an unsupervised prompt distillation framework for vision-language models like CLIP, enabling knowledge transfer from large teacher models to lightweight students using unlabeled domain images and prompt-driven imitation.

Contribution

It is the first to perform unsupervised domain-specific prompt distillation for CLIP and proposes a pre-storing mechanism for text features as shared class vectors.
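
The pre-storing idea can be illustrated with a minimal sketch: text features are computed once per class, normalized, cached, and then reused to score image features from any image encoder. This is an illustration under stated assumptions, not the authors' implementation. `ClassVectorStore`, `toy_encode_text`, and the two-dimensional features are hypothetical stand-ins for a real CLIP text encoder and its outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


class ClassVectorStore:
    """Hypothetical cache: one normalized text feature per class, computed once."""

    def __init__(self, encode_text, class_names):
        self.class_names = list(class_names)
        # The text encoder is called here and never again afterwards.
        self.vectors = [l2_normalize(encode_text(n)) for n in self.class_names]

    def logits(self, image_feature, scale=1.0):
        """Cosine similarity between one image feature and every stored class vector."""
        img = l2_normalize(image_feature)
        return [scale * sum(a * b for a, b in zip(img, v)) for v in self.vectors]


# Toy stand-in for a text encoder (assumption); it records each call it receives.
calls = []


def toy_encode_text(name):
    calls.append(name)
    return [float(len(name)), 1.0]


store = ClassVectorStore(toy_encode_text, ["cat", "airplane"])

# The same stored vectors score features from two different image encoders.
teacher_logits = store.logits([3.0, 1.0])
student_logits = store.logits([8.0, 1.0])
```

Because the vectors are shared, scoring additional images costs only dot products, and the text encoder does not need to be kept in memory once the store is built.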

Findings

01. Effective knowledge transfer on 11 datasets
02. Eliminates reliance on labeled data
03. Improves student model performance

Abstract

Prompt learning has emerged as a valuable technique in enhancing vision-language models (VLMs) such as CLIP for downstream tasks in specific domains. Existing work mainly focuses on designing various learning forms of prompts, neglecting the potential of prompts as effective distillers for learning from larger teacher models. In this paper, we introduce an unsupervised domain prompt distillation framework, which aims to transfer the knowledge of a larger teacher model to a lightweight target model through prompt-driven imitation using unlabeled domain images. Specifically, our framework consists of two distinct stages. In the initial stage, we pre-train a large CLIP teacher model using domain (few-shot) labels. After pre-training, we leverage the unique decoupled-modality characteristics of CLIP by pre-computing and storing the text features as class vectors only once through the…
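
The second stage, as far as the excerpt above describes it, has the student imitate the teacher on unlabeled images. A minimal sketch of such an objective follows, assuming a temperature-scaled KL divergence between the teacher's and student's class distributions. That choice is a common distillation loss and is an assumption here, since the truncated abstract does not state the loss. The function names and example logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Assumed objective: the student matches the teacher's soft predictions.

    No ground-truth label appears anywhere, which is what makes the
    imitation step usable on unlabeled domain images.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return kl_divergence(p, q)


# Hypothetical logits for one unlabeled image over three classes.
teacher = [2.0, 0.5, -1.0]
student_untrained = [0.0, 0.0, 0.0]

loss_before = distillation_loss(teacher, student_untrained)
loss_matched = distillation_loss(teacher, teacher)
```

In a real training loop the student logits would come from a lightweight image encoder with learnable prompts, and the loss would be minimized with respect to those prompts only; that wiring is omitted here.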

Code & Models

Repositories

zhengli97/promptkd (PyTorch, official)

Models

zhengli97/prompt_learning_dataset (Hugging Face, model)

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling

Methods: Contrastive Language-Image Pre-training · Knowledge Distillation