Anomaly Detection by Adapting a pre-trained Vision Language Model

Yuxuan Cai; Xinwei He; Dingkang Liang; Ao Tong; Xiang Bai

arXiv:2403.09493·cs.CV·March 15, 2024·2 cites

TL;DR

This paper introduces CLIP-ADA, a unified anomaly detection framework leveraging pre-trained CLIP models with learnable prompts and region refinement, achieving state-of-the-art results across multiple industrial image datasets.

Contribution

The paper proposes a novel adaptation of CLIP for anomaly detection, incorporating learnable prompts and an anomaly region refinement strategy for improved localization and detection.

Findings

01. Achieves state-of-the-art scores of 97.5 (detection) and 55.6 (localization) on MVTec-AD.

02. Achieves 89.3 (detection) and 33.1 (localization) on VisA.

03. Performs well with limited training data, demonstrating robustness.

Abstract

Recently, large vision and language models have proven successful when adapted to many downstream tasks. In this paper, we present a unified framework named CLIP-ADA for Anomaly Detection by Adapting a pre-trained CLIP model. To this end, we make two important improvements: 1) To achieve unified anomaly detection across industrial images of multiple categories, we introduce a learnable prompt and propose to associate it with abnormal patterns through self-supervised learning. 2) To fully exploit the representation power of CLIP, we introduce an anomaly region refinement strategy to improve localization quality. During testing, anomalies are localized by directly calculating the similarity between the representation of the learnable prompt and the image. Comprehensive experiments demonstrate the superiority of our framework, e.g., we achieve the state-of-the-art…
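
The test-time step in the abstract, scoring image regions by their similarity to the learnable prompt's representation, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the image is already encoded as a grid of patch embeddings, that the prompt is a single vector in the same space, that cosine similarity is the similarity measure, and that the image-level score is the maximum patch score. The names `cosine`, `anomaly_map`, and `image_score` are hypothetical, and the embeddings are toy values rather than CLIP outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # epsilon guards against zero vectors


def anomaly_map(patch_embeddings, prompt_embedding):
    """Score every patch by its similarity to the abnormal-prompt embedding.

    patch_embeddings: H x W grid (nested lists) of D-dimensional vectors.
    Returns an H x W grid of scores; a higher score means "more anomalous".
    """
    return [[cosine(p, prompt_embedding) for p in row] for row in patch_embeddings]


def image_score(score_map):
    """Image-level detection score: the maximum patch score (an assumption)."""
    return max(max(row) for row in score_map)


# Toy 2 x 2 grid of 2-D patch embeddings; only the bottom-left patch
# points in roughly the same direction as the prompt vector.
prompt = [1.0, 0.0]
patches = [
    [[0.0, 1.0], [0.1, 1.0]],
    [[1.0, 0.1], [0.0, 1.0]],
]
scores = anomaly_map(patches, prompt)
```

In this toy grid the bottom-left patch receives the highest score, so it would be flagged as the anomalous region, and `image_score(scores)` returns that same value as the image-level score.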

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: Contrastive Language-Image Pre-training