Beyond General Prompts: Automated Prompt Refinement using Contrastive Class Alignment Scores for Disambiguating Objects in Vision-Language Models
Lucas Choi, Ross Greer

TL;DR
This paper presents an automated prompt refinement method for vision-language models using a novel Contrastive Class Alignment Score, improving object detection accuracy by selecting semantically aligned prompts without extra training.
Contribution
Introduces the Contrastive Class Alignment Score (CCAS), a metric for automatic prompt refinement that improves VLM object detection performance without additional training or labeled data.
Findings
CCAS effectively ranks prompts based on semantic alignment.
Automatic prompt selection improves detection accuracy.
The method is scalable and model-agnostic.
Abstract
Vision-language models (VLMs) offer flexible object detection through natural language prompts but suffer from performance variability depending on prompt phrasing. In this paper, we introduce a method for automated prompt refinement using a novel metric called the Contrastive Class Alignment Score (CCAS), which ranks prompts based on their semantic alignment with a target object class while penalizing similarity to confounding classes. Our method generates diverse prompt candidates via a large language model and filters them through CCAS, computed using prompt embeddings from a sentence transformer. We evaluate our approach on challenging object categories, demonstrating that our automatic selection of high-precision prompts improves object detection accuracy without the need for additional model training or labeled data. This scalable and model-agnostic pipeline offers a principled…
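The scoring idea described in the abstract can be illustrated with a minimal sketch. The exact CCAS formula is not given in this summary, so the form used here (cosine similarity to the target class minus the mean cosine similarity to the confounding classes) is an assumption, as are the function names `cosine`, `ccas`, and `rank_prompts` and the toy 3-dimensional vectors, which stand in for real sentence-transformer embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def ccas(prompt_vec, target_vec, confounder_vecs):
    """Assumed scoring form (not confirmed by the source):
    similarity to the target minus mean similarity to confounders."""
    target_sim = cosine(prompt_vec, target_vec)
    confound_sim = sum(cosine(prompt_vec, c) for c in confounder_vecs) / len(confounder_vecs)
    return target_sim - confound_sim

def rank_prompts(prompt_vecs, target_vec, confounder_vecs):
    """Return (prompt, score) pairs sorted from highest to lowest score."""
    scored = [(p, ccas(v, target_vec, confounder_vecs)) for p, v in prompt_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy embeddings: hypothetical values chosen only to show the ranking behaviour.
target = [1.0, 0.0, 0.0]
confounders = [[0.0, 1.0, 0.0]]
prompts = {
    "specific prompt": [0.9, 0.1, 0.0],   # close to target, far from confounder
    "ambiguous prompt": [0.6, 0.6, 0.0],  # equally close to both
}

ranked = rank_prompts(prompts, target, confounders)
for prompt, score in ranked:
    print(f"{prompt}: {score:.3f}")
```

In this toy setup the prompt that sits equally close to the target and the confounder receives a score near zero and ranks below the more specific prompt. In a full pipeline the vectors would come from a sentence transformer applied to LLM-generated prompt candidates, and the top-ranked prompts would be passed to the VLM detector.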
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Semantic Web and Ontologies
