CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No
Hualiang Wang, Yi Li, Huifeng Yao, Xiaomeng Li

TL;DR
This paper introduces CLIPN, a novel zero-shot OOD detection method that enhances CLIP with negation semantics and threshold-free algorithms, significantly improving OOD detection performance across multiple datasets.
Contribution
We propose CLIPN, which equips CLIP with negation semantics through learnable prompts and new loss functions, enabling effective zero-shot OOD detection without thresholds.
Findings
CLIPN outperforms 7 baseline algorithms, improving AUROC by at least 2.34% and reducing FPR95 by at least 11.64%.
CLIPN achieves state-of-the-art results on 9 benchmark datasets for zero-shot OOD detection.
The method leverages negation semantics to distinguish in-distribution and out-of-distribution samples effectively.
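For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how AUROC and FPR95 are commonly computed from per-sample scores. It assumes a higher score means "more likely in-distribution"; the function names `auroc` and `fpr_at_95_tpr` are illustrative and do not come from the paper or its code.

```python
def auroc(id_scores, ood_scores):
    """Chance that a random ID sample outscores a random OOD sample (ties count half)."""
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def fpr_at_95_tpr(id_scores, ood_scores):
    """Share of OOD samples accepted as ID at the threshold that keeps >= 95% of ID samples."""
    ranked = sorted(id_scores, reverse=True)
    # Smallest count of top-ranked ID samples covering 95% of them (integer ceiling).
    k = -(-95 * len(ranked) // 100)
    threshold = ranked[k - 1]
    accepted_ood = sum(1 for s in ood_scores if s >= threshold)
    return accepted_ood / len(ood_scores)


# Toy scores, made up purely for illustration.
id_scores = [0.9, 0.8, 0.7, 0.6]
ood_scores = [0.65, 0.3, 0.2, 0.1]
print(auroc(id_scores, ood_scores), fpr_at_95_tpr(id_scores, ood_scores))
```

A higher AUROC is better, while a lower FPR95 is better, which is why an improvement in the second metric is reported as a reduction.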
Abstract
Out-of-distribution (OOD) detection refers to training the model on an in-distribution (ID) dataset to classify whether the input images come from unknown classes. Considerable effort has been invested in designing various OOD detection methods based on either convolutional neural networks or transformers. However, zero-shot OOD detection methods driven by CLIP, which only require class names for ID, have received less attention. This paper presents a novel method, namely CLIP saying no (CLIPN), which empowers the logic of saying no within CLIP. Our key motivation is to equip CLIP with the capability of distinguishing OOD and ID samples using positive-semantic prompts and negation-semantic prompts. Specifically, we design a novel learnable no prompt and a no text encoder to capture negation semantics within images. Subsequently, we introduce two loss functions: the image-text…
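One way to picture how paired positive and negation prompts could yield a decision without a tuned threshold is sketched below. This is an illustration under stated assumptions, not the authors' algorithm, which the truncated abstract does not spell out: it assumes each ID class has one "yes" logit (image vs. positive prompt) and one "no" logit (image vs. negation prompt), and the helper `classify_or_reject` is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_reject(yes_logits, no_logits):
    """Return the predicted class index, or -1 if the sample is treated as OOD.

    yes_logits[i]: assumed similarity of the image to the positive prompt of class i.
    no_logits[i]:  assumed similarity of the image to the negation prompt of class i.
    """
    class_probs = softmax(yes_logits)
    best = max(range(len(class_probs)), key=class_probs.__getitem__)
    # Two-way softmax between "is class `best`" and "is not class `best`".
    p_yes, p_no = softmax([yes_logits[best], no_logits[best]])
    # No tuned threshold: the sample is rejected only when "no" beats "yes".
    return -1 if p_no > p_yes else best


# Toy logits, made up purely for illustration.
print(classify_or_reject([5.0, 1.0, 0.5], [1.0, 4.0, 4.0]))  # accepted as class 0
print(classify_or_reject([2.0, 1.0, 0.5], [3.0, 0.1, 0.2]))  # rejected as OOD
```

The design point of the sketch is that the comparison between the "yes" and "no" probabilities replaces a dataset-specific score cutoff.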
Taxonomy
Topics: COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Contrastive Language-Image Pre-training
