VL4AD: Vision-Language Models Improve Pixel-wise Anomaly Detection

Liangyu Zhong; Joachim Sicking; Fabian Hüger; Hanno Gottschalk

arXiv:2409.17330 · cs.CV · September 27, 2024

Open Access

TL;DR

VL4AD leverages vision-language pre-training and a novel scoring method to enhance pixel-wise anomaly detection without additional data collection or retraining, showing competitive results on benchmarks.

Contribution

Introduces VL4AD, a novel approach integrating vision-language encoders into anomaly detection, with a new scoring function for outlier supervision without extra training.

Findings

01. Achieves competitive benchmark performance

02. Utilizes data- and training-free outlier supervision

03. Demonstrates effectiveness of vision-language models for anomaly detection

Abstract

Semantic segmentation networks have achieved significant success under the assumption of independent and identically distributed data. However, these networks often struggle to detect anomalies from unknown semantic classes due to the limited set of visual concepts they are typically trained on. To address this issue, anomaly segmentation often involves fine-tuning on outlier samples, necessitating additional efforts for data collection, labeling, and model retraining. Seeking to avoid this cumbersome work, we take a different approach and propose to incorporate Vision-Language (VL) encoders into existing anomaly detectors to leverage the semantically broad VL pre-training for improved outlier awareness. Additionally, we propose a new scoring function that enables data- and training-free outlier supervision via textual prompts. The resulting VL4AD model, which includes max-logit prompt…
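The abstract mentions max-logit scoring and outlier supervision through textual prompts, but it is truncated here, so the exact scoring function is not available. The following is only a minimal sketch of the general idea, under these assumptions: each pixel has an embedding in the same space as the text-prompt embeddings, logits are cosine similarities, and the function names `max_logit_anomaly_score` and `prompt_contrast_score` are hypothetical. The second function is one illustrative way text prompts could act as outlier supervision without training; it is not claimed to be the paper's scoring function.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def max_logit_anomaly_score(pixel_emb, inlier_text_embs):
    """Max-logit baseline: negative of the best match to any known class.

    Higher values mean the pixel resembles no inlier class prompt well.
    """
    return -max(cosine(pixel_emb, t) for t in inlier_text_embs)


def prompt_contrast_score(pixel_emb, inlier_text_embs, outlier_text_embs):
    """Hypothetical variant (an assumption, not the paper's formula).

    Compares the best match among outlier prompts with the best match
    among inlier prompts; no outlier images or retraining are involved.
    """
    best_inlier = max(cosine(pixel_emb, t) for t in inlier_text_embs)
    best_outlier = max(cosine(pixel_emb, t) for t in outlier_text_embs)
    return best_outlier - best_inlier


# Toy 3-D embeddings standing in for real vision-language features.
inlier_prompts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # e.g. "road", "car"
outlier_prompts = [[0.0, 0.0, 1.0]]                    # e.g. "unknown object"

known_pixel = [0.9, 0.1, 0.0]
unknown_pixel = [0.1, 0.0, 0.9]

score_known = max_logit_anomaly_score(known_pixel, inlier_prompts)
score_unknown = max_logit_anomaly_score(unknown_pixel, inlier_prompts)
contrast_known = prompt_contrast_score(known_pixel, inlier_prompts, outlier_prompts)
contrast_unknown = prompt_contrast_score(unknown_pixel, inlier_prompts, outlier_prompts)
```

In this toy setup the pixel aligned with an inlier prompt receives a lower anomaly score than the pixel aligned with the outlier prompt under both functions. In practice the embeddings would come from a vision-language encoder and the scores would be computed for every pixel of the segmentation map.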

Taxonomy

Topics: Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI · Advanced Image and Video Retrieval Techniques

Methods: Sparse Evolutionary Training