Learning from True-False Labels via Multi-modal Prompt Retrieving
Zhongnian Li, Jinghao Xu, Peng Ying, Meng Wei, Xinzheng Xu

TL;DR
This paper introduces True-False Labels (TFLs), a new weakly supervised labeling setting in which labels are generated by VLMs, along with a Multi-modal Prompt Retrieving (MRP) method, to improve label accuracy and task performance.
Contribution
It proposes the TFL setting and a convolution-based Multi-modal Prompt Retrieving (MRP) method that bridges the gap between VLM knowledge and target learning tasks.
Findings
The TFL setting achieves high label accuracy when the labels are generated by VLMs.
The MRP method makes effective use of multi-modal prompts.
Experimental results validate the approach's effectiveness.
Abstract
Pre-trained Vision-Language Models (VLMs) exhibit strong zero-shot classification abilities, demonstrating great potential for generating weakly supervised labels. Unfortunately, existing weakly supervised learning methods struggle to generate accurate labels via VLMs. In this paper, we propose a novel weakly supervised labeling setting, namely True-False Labels (TFLs), which can achieve high accuracy when generated by VLMs. A TFL indicates whether an instance belongs to a label that is sampled uniformly at random from the candidate label set. Specifically, we theoretically derive a risk-consistent estimator to explore and utilize the conditional probability distribution information of TFLs. In addition, we propose a convolution-based Multi-modal Prompt Retrieving (MRP) method to bridge the gap between the knowledge of VLMs and target learning tasks. Experimental…
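The labeling rule described in the abstract (draw one label uniformly from the candidate set, then record whether the instance belongs to it) can be illustrated with a minimal sketch. This is not the authors' code: the function name `make_true_false_label`, the toy candidate set, and the use of a ground-truth label as a stand-in for the VLM's yes/no answer are all assumptions made for illustration.

```python
import random


def make_true_false_label(true_label, candidate_labels, rng):
    """Hypothetical helper: sample one candidate label uniformly at random
    and return it with a True/False flag saying whether it matches.

    Assumption: the ground-truth label stands in for the VLM's answer to
    "does this instance belong to the sampled label?".
    """
    sampled = rng.choice(candidate_labels)
    return sampled, sampled == true_label


# Toy candidate label set (illustrative only).
candidates = ["cat", "dog", "bird", "fish"]
rng = random.Random(0)

# Generate many TFLs for an instance whose (assumed) true class is "dog".
pairs = [make_true_false_label("dog", candidates, rng) for _ in range(10000)]

# Under uniform sampling, roughly 1 / len(candidates) of the flags are True.
true_rate = sum(flag for _, flag in pairs) / len(pairs)
print(f"fraction of True labels: {true_rate:.3f}")
```

With K candidate labels, only about 1/K of the generated labels are True under this rule, so most of the supervision comes as "False" answers for a sampled label.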
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
