UP-DP: Unsupervised Prompt Learning for Data Pre-Selection with Vision-Language Models

Xin Li; Sima Behpour; Thang Doan; Wenbin He; Liang Gou; Liu Ren

arXiv:2307.11227·cs.CV·July 24, 2023·2 cites

TL;DR

This paper introduces UP-DP, an unsupervised prompt learning method that enhances vision-language models for data pre-selection, leading to better representation and significant performance improvements across multiple datasets.

Contribution

It is the first to incorporate unsupervised prompt learning into vision-language models for data pre-selection, improving dataset representation and generalizability.

Findings

01. Achieves up to 20% performance gain over state-of-the-art methods.

02. Prompts learned from one dataset generalize well to others.

03. Joint vision-text features outperform visual-only features in data pre-selection.

Abstract

In this study, we investigate the task of data pre-selection, which aims to select instances for labeling from an unlabeled dataset in a single pass, thereby optimizing performance for undefined downstream tasks under a limited annotation budget. Previous approaches to data pre-selection relied solely on visual features extracted from foundation models, such as CLIP and BLIP-2, and largely ignored the power of text features. In this work, we argue that, with proper design, the joint feature space of vision and text can yield a better representation for data pre-selection. To this end, we introduce UP-DP, a simple yet effective unsupervised prompt learning approach that adapts vision-language models, like BLIP-2, for data pre-selection. Specifically, with the BLIP-2 parameters frozen, we train text prompts to extract the joint features with improved representation,…

Videos

UP-DP: Unsupervised Prompt Learning for Data Pre-Selection with Vision-Language Models (SlidesLive)

Taxonomy

Topics: Natural Language Processing Techniques · Text and Document Classification Technologies · Topic Modeling

Methods: Contrastive Language-Image Pre-training