Anomaly Detection by Adapting a pre-trained Vision Language Model
Yuxuan Cai, Xinwei He, Dingkang Liang, Ao Tong, Xiang Bai

TL;DR
This paper introduces CLIP-ADA, a unified anomaly detection framework leveraging pre-trained CLIP models with learnable prompts and region refinement, achieving state-of-the-art results across multiple industrial image datasets.
Contribution
The paper proposes a novel adaptation of CLIP for anomaly detection, incorporating learnable prompts and an anomaly region refinement strategy for improved localization and detection.
Findings
Achieves state-of-the-art 97.5/55.6 on MVTec-AD for detection/localization.
Achieves 89.3/33.1 on VisA for detection/localization.
Performs well with limited training data, demonstrating robustness.
Abstract
Recently, large vision and language models have proven successful when adapted to many downstream tasks. In this paper, we present a unified framework named CLIP-ADA for Anomaly Detection by Adapting a pre-trained CLIP model. To this end, we make two important improvements: 1) To achieve unified anomaly detection across industrial images of multiple categories, we introduce a learnable prompt and propose to associate it with abnormal patterns through self-supervised learning. 2) To fully exploit the representation power of CLIP, we introduce an anomaly region refinement strategy to improve localization quality. During testing, anomalies are localized by directly calculating the similarity between the representation of the learnable prompt and the image. Comprehensive experiments demonstrate the superiority of our framework, e.g., we achieve the state-of-the-art…
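The test-time step described in the abstract (scoring the image by its similarity to the learned prompt representation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the assumption that the image is represented by a grid of patch embeddings, the choice of cosine similarity, and the use of the maximum as an image-level score are all assumptions made for the example.

```python
import numpy as np

def anomaly_map(patch_feats, prompt_feat, grid_hw):
    """Cosine similarity between each patch embedding and one prompt embedding.

    patch_feats: (N, D) array of image patch embeddings (hypothetical input).
    prompt_feat: (D,) embedding of the learned "abnormal" prompt (hypothetical).
    grid_hw:     (H, W) patch grid with H * W == N.
    """
    h, w = grid_hw
    if patch_feats.shape[0] != h * w:
        raise ValueError("grid_hw does not match the number of patches")
    # L2-normalise both sides so the dot product is a cosine similarity.
    patches = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    prompt = prompt_feat / np.linalg.norm(prompt_feat)
    # One similarity value per patch, reshaped back onto the patch grid.
    return (patches @ prompt).reshape(h, w)

def image_score(amap):
    """Image-level score taken as the highest patch similarity (an assumption)."""
    return float(amap.max())

# Toy usage with random features; patch 5 is planted to align with the prompt.
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))
prompt = rng.normal(size=8)
feats[5] = 3.0 * prompt
amap = anomaly_map(feats, prompt, (4, 4))
```

In this toy setup the planted patch lands at row 1, column 1 of the 4x4 map and receives the highest similarity. In practice the map would typically be upsampled to the input resolution before thresholding, a step omitted here.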
Taxonomy
Topics: Anomaly Detection Techniques and Applications
Methods: Contrastive Language-Image Pre-training
