KAnoCLIP: Zero-Shot Anomaly Detection through Knowledge-Driven Prompt Learning and Enhanced Cross-Modal Integration

Chengyuan Li; Suyang Zhou; Jieping Kong; Lei Qi; Hui Xue

arXiv:2501.03786·cs.CV·January 8, 2025

Open Access

TL;DR

KAnoCLIP introduces a knowledge-driven prompt learning framework that enhances zero-shot anomaly detection by integrating general and image-specific knowledge, achieving state-of-the-art results across diverse datasets.

Contribution

This work proposes KAnoCLIP, a novel zero-shot anomaly detection (ZSAD) framework that eliminates the need for manually crafted fixed prompts and improves pixel-level detection through knowledge-driven prompt learning and enhanced cross-modal fusion.

Findings

01. Achieves state-of-the-art performance on 12 datasets.

02. Outperforms existing methods in generalization.

03. Enhances pixel-level anomaly segmentation.

Abstract

Zero-shot anomaly detection (ZSAD) identifies anomalies without requiring training samples from the target dataset, which is essential in scenarios with privacy concerns or limited data. Vision-language models such as CLIP show potential for ZSAD but have two limitations: manually crafted fixed textual descriptions or anomaly prompts are time-consuming to write and prone to semantic ambiguity, and CLIP struggles with pixel-level anomaly segmentation because it focuses on global semantics more than local details. To address these limitations, we introduce KAnoCLIP, a novel ZSAD framework that leverages vision-language models. KAnoCLIP combines general knowledge from a Large Language Model (GPT-3.5) and fine-grained, image-specific knowledge from a Visual Question Answering system (Llama3) via Knowledge-Driven Prompt Learning (KnPL). KnPL uses a knowledge-driven (KD) loss function to create learnable anomaly…
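For readers unfamiliar with how CLIP-style models are typically used for zero-shot anomaly scoring, the sketch below may help. It is not KAnoCLIP's method and does not reproduce KnPL or the KD loss; it only illustrates the generic idea of comparing a visual feature against a "normal" and an "abnormal" text embedding and applying a softmax, both per image and per patch. The function names, the toy 2-D embeddings, and the temperature value of 0.07 are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def anomaly_score(feature, normal_text, abnormal_text, temperature=0.07):
    """Softmax over the two text similarities; returns P(abnormal).

    `temperature` is an assumed value, not taken from the paper.
    """
    s_normal = cosine(feature, normal_text) / temperature
    s_abnormal = cosine(feature, abnormal_text) / temperature
    # Subtract the max before exponentiating for numerical stability.
    m = max(s_normal, s_abnormal)
    e_normal = math.exp(s_normal - m)
    e_abnormal = math.exp(s_abnormal - m)
    return e_abnormal / (e_normal + e_abnormal)


def anomaly_map(patch_features, normal_text, abnormal_text):
    """Apply the same scoring to every patch feature in a 2-D grid."""
    return [
        [anomaly_score(p, normal_text, abnormal_text) for p in row]
        for row in patch_features
    ]


# Toy embeddings (hypothetical): axis 0 ~ "normal", axis 1 ~ "abnormal".
normal_text = [1.0, 0.0]
abnormal_text = [0.0, 1.0]

patches = [
    [[1.0, 0.0], [1.0, 0.1]],
    [[1.0, 1.0], [0.0, 1.0]],
]
heatmap = anomaly_map(patches, normal_text, abnormal_text)
```

In this toy grid, the patch aligned with the "abnormal" embedding receives a score near 1 and the patch aligned with the "normal" embedding a score near 0. In a real pipeline the vectors would come from a vision-language model's image and text encoders rather than being written by hand.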

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Seismology and Earthquake Studies · Advanced Data Processing Techniques

Methods: Softmax · Attention Is All You Need · ALIGN · Contrastive Language-Image Pre-training