Rethinking Prompting Strategies for Multi-Label Recognition with Partial Annotations
Samyak Rawlekar, Shubhang Bhatnagar, Narendra Ahuja

TL;DR
This paper investigates prompt-learning strategies for multi-label recognition with partial annotations and finds that learning only positive prompts, paired with negative embeddings learned directly in the feature space, outperforms dual-prompt approaches.
Contribution
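The paper introduces PositiveCoOp and NegativeCoOp, in which only one of the two prompts is learned and the other is replaced by an embedding learned directly in the shared feature space. It shows that learning only the positive prompt and replacing the negative prompt with such an embedding improves multi-label recognition.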
Findings
Negative prompts degrade performance in partial annotation settings.
Learning only positive prompts with negative embeddings outperforms dual prompt methods.
A baseline using only vision features performs comparably to dual-prompt approaches when the proportion of missing labels is low.
Abstract
Vision-language models (VLMs) like CLIP have been adapted for Multi-Label Recognition (MLR) with partial annotations by leveraging prompt-learning, where positive and negative prompts are learned for each class to associate their embeddings with class presence or absence in the shared vision-text feature space. While this approach improves MLR performance by relying on VLM priors, we hypothesize that learning negative prompts may be suboptimal, as the datasets used to train VLMs lack image-caption pairs explicitly focusing on class absence. To analyze the impact of positive and negative prompt learning on MLR, we introduce PositiveCoOp and NegativeCoOp, where only one prompt is learned with VLM guidance while the other is replaced by an embedding vector learned directly in the shared feature space without relying on the text encoder. Through empirical analysis, we observe that negative…
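The PositiveCoOp idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names `presence_probability` and `masked_bce` are hypothetical, the two-way softmax over positive and negative similarities is one plausible way to combine the scores, the temperature of 0.07 is illustrative, and the plain masked binary cross-entropy is a stand-in for whatever loss the paper actually uses. In a real setup the positive embedding would come from a text encoder applied to a learned prompt, while the negative embedding would be a free parameter vector in the shared feature space.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def presence_probability(image_feat, pos_text_emb, neg_emb, temperature=0.07):
    """Probability that a class is present in the image.

    pos_text_emb: stands in for the text-encoder output of a learned positive prompt.
    neg_emb: stands in for a vector learned directly in the feature space
             (no text encoder involved), as in the PositiveCoOp description.
    Assumption: scores are combined with a two-way softmax, which equals
    a sigmoid of the scaled similarity difference.
    """
    s_pos = cosine(image_feat, pos_text_emb) / temperature
    s_neg = cosine(image_feat, neg_emb) / temperature
    return 1.0 / (1.0 + math.exp(s_neg - s_pos))

def masked_bce(probs, labels, eps=1e-12):
    """Binary cross-entropy averaged over annotated classes only.

    labels: 1 = present, 0 = absent, None = annotation missing (skipped).
    Assumption: a simple stand-in loss for the partial-annotation setting.
    """
    terms = []
    for p, y in zip(probs, labels):
        if y is None:
            continue  # unknown label contributes nothing to the loss
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        terms.append(-math.log(p) if y == 1 else -math.log(1.0 - p))
    return sum(terms) / len(terms) if terms else 0.0

# Toy 2-D example: the image feature aligns with the positive embedding.
p_aligned = presence_probability([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
# The image feature is equally similar to both embeddings.
p_neutral = presence_probability([1.0, 1.0], [1.0, 0.0], [0.0, 1.0])
# Three classes, the third has no annotation and is ignored by the loss.
loss = masked_bce([p_aligned, 0.1, 0.5], [1, 0, None])
```

In this toy setup the aligned image gets a presence probability close to 1, the equidistant image gets 0.5, and the unannotated third class does not affect the loss. NegativeCoOp, as described in the abstract, would swap the roles: the negative embedding would come from a learned prompt and the positive one would be the directly learned vector.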
Taxonomy
Topics: Text and Document Classification Technologies
Methods: Contrastive Language-Image Pre-training
