LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models

Yabin Zhang; Wenjie Zhu; Chenhang He; Lei Zhang

arXiv:2407.08966·cs.CV·July 15, 2024·1 cites

TL;DR

LAPT introduces an automated prompt tuning method for vision-language models that enhances out-of-distribution detection, reduces manual effort, and improves robustness and accuracy across diverse OOD scenarios.

Contribution

The paper presents an automated prompt tuning framework that reduces the need for manual prompt engineering in OOD detection with vision-language models. It builds distribution-aware prompts from ID class names and automatically mined negative labels, and collects the training samples for those labels autonomously.

Findings

01. LAPT outperforms manual prompts in OOD detection accuracy.

02. It improves ID classification and robustness to covariate shifts.

03. The method operates autonomously with only class names as input.

Abstract

Out-of-distribution (OOD) detection is crucial for model reliability, as it identifies samples from unknown classes and reduces errors due to unexpected inputs. Vision-Language Models (VLMs) such as CLIP are emerging as powerful tools for OOD detection by integrating multi-modal information. However, the practical application of such systems is challenged by manual prompt engineering, which demands domain expertise and is sensitive to linguistic nuances. In this paper, we introduce Label-driven Automated Prompt Tuning (LAPT), a novel approach to OOD detection that reduces the need for manual prompt engineering. We develop distribution-aware prompts with in-distribution (ID) class names and negative labels mined automatically. Training samples linked to these class labels are collected autonomously via image synthesis and retrieval methods, allowing for prompt learning without manual…
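To make the role of ID class names and negative labels concrete, here is a minimal sketch of one common way such text prompts can be turned into an OOD score with a CLIP-style model. It is an illustration under assumptions, not the scoring rule or training procedure of LAPT: the function `id_score`, the temperature value, and the toy embeddings are all hypothetical, and real image and text embeddings would come from a VLM's encoders. The score is the softmax mass an image places on ID prompts versus negative prompts; a low value suggests an OOD input.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def id_score(image_emb, id_text_emb, neg_text_emb, temperature=0.01):
    """Hypothetical ID-ness score (an assumption, not the paper's formula).

    image_emb:    (N, D) image embeddings
    id_text_emb:  (K, D) text embeddings of prompts built from ID class names
    neg_text_emb: (M, D) text embeddings of prompts built from negative labels
    Returns an (N,) array in [0, 1]: softmax mass assigned to the ID prompts.
    """
    img = l2_normalize(image_emb)
    text = l2_normalize(np.concatenate([id_text_emb, neg_text_emb], axis=0))
    logits = img @ text.T / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    num_id = id_text_emb.shape[0]
    return probs[:, :num_id].sum(axis=-1)


if __name__ == "__main__":
    # Toy stand-ins for encoder outputs: 3 ID prompts and 5 negative prompts
    # placed on orthogonal axes of an 8-dimensional space.
    basis = np.eye(8)
    id_text, neg_text = basis[:3], basis[3:]
    images = np.stack([basis[0], basis[5]])  # one ID-like, one OOD-like image
    scores = id_score(images, id_text, neg_text)
    threshold = 0.5  # illustrative cut-off, would be tuned on held-out data
    for s in scores:
        print(f"score={s:.3f} -> {'ID' if s >= threshold else 'OOD'}")
```

In a prompt-tuning setting, the fixed text embeddings above would be replaced by embeddings of learnable prompts, which is where automatically collected training samples would come in.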

Taxonomy

Topics: Advanced Chemical Sensor Technologies

Methods: Contrastive Language-Image Pre-training