LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models
Yabin Zhang, Wenjie Zhu, Chenhang He, Lei Zhang

TL;DR
LAPT introduces an automated prompt tuning method for vision-language models that enhances out-of-distribution detection, reduces manual effort, and improves robustness and accuracy across diverse OOD scenarios.
Contribution
The paper presents a novel automated prompt tuning framework that eliminates manual prompt engineering for OOD detection with vision-language models, using distribution-aware prompts and autonomous data collection.
Findings
LAPT outperforms manual prompts in OOD detection accuracy.
It improves ID classification and robustness to covariate shifts.
The method operates autonomously with only class names as input.
Abstract
Out-of-distribution (OOD) detection is crucial for model reliability, as it identifies samples from unknown classes and reduces errors due to unexpected inputs. Vision-Language Models (VLMs) such as CLIP are emerging as powerful tools for OOD detection by integrating multi-modal information. However, the practical application of such systems is challenged by manual prompt engineering, which demands domain expertise and is sensitive to linguistic nuances. In this paper, we introduce Label-driven Automated Prompt Tuning (LAPT), a novel approach to OOD detection that reduces the need for manual prompt engineering. We develop distribution-aware prompts with in-distribution (ID) class names and negative labels mined automatically. Training samples linked to these class labels are collected autonomously via image synthesis and retrieval methods, allowing for prompt learning without manual…
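To make the role of ID class names and negative labels concrete, the sketch below shows one common way such labels can be turned into an OOD score: take a softmax over the image's similarity to every text prompt and measure how much of the probability mass lands on the ID prompts. This is an illustrative assumption, not the scoring rule confirmed by the paper. The function names (`l2_normalize`, `cosine`, `id_score`), the `temperature` value, and the toy 3-dimensional vectors are all hypothetical; in practice the embeddings would come from a VLM such as CLIP.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def id_score(image_emb, id_text_embs, neg_text_embs, temperature=0.01):
    """Fraction of softmax mass on ID prompts; higher suggests in-distribution.

    The temperature is a hypothetical setting chosen for this toy example.
    """
    all_text = id_text_embs + neg_text_embs
    logits = [cosine(image_emb, t) / temperature for t in all_text]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    return sum(exps[:len(id_text_embs)]) / sum(exps)


# Toy stand-ins for text embeddings of two ID class names and one negative label.
id_text = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
neg_text = [[0.0, 0.0, 1.0]]

# Toy stand-ins for image embeddings: one near an ID class, one near the negative label.
id_like_image = [0.9, 0.1, 0.0]
ood_like_image = [0.1, 0.0, 0.9]

score_id = id_score(id_like_image, id_text, neg_text)
score_ood = id_score(ood_like_image, id_text, neg_text)
print(f"ID-like image score:  {score_id:.4f}")
print(f"OOD-like image score: {score_ood:.4f}")
```

An image would be flagged as OOD when its score falls below a threshold chosen on held-out data. Prompt tuning, as described in the abstract, would replace the fixed text embeddings here with learned ones.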
Taxonomy
Topics: Advanced Chemical Sensor Technologies
Methods: Contrastive Language-Image Pre-training
