LogicAD: Explainable Anomaly Detection via VLM-based Text Feature Extraction

Er Jin; Qihui Feng; Yongli Mou; Stefan Decker; Gerhard Lakemeyer; Oliver Simons; Johannes Stegmaier

arXiv:2501.01767·cs.CV·January 9, 2025

TL;DR

This paper introduces LogicAD, a novel approach using Vision Language Models combined with logic reasoning to detect anomalies in images, providing explainable results and achieving state-of-the-art performance on benchmark datasets.

Contribution

The paper demonstrates that autoregressive, multimodal Vision Language Models (AVLMs) are effective for logical anomaly detection, and it introduces a method that combines format embedding with a logic reasoner to improve interpretability and accuracy.

Findings

01

Achieves SOTA AUROC of 86.0% on MVTec LOCO AD

02

F1-max score of 83.7%, outperforming previous methods

03

Provides explainable anomaly detection results
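
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how image-level AUROC and F1-max are commonly computed from per-image anomaly scores. It is not the paper's evaluation code: the function names `auroc` and `f1_max` are hypothetical, and the scores and labels are invented toy values used only for illustration.

```python
from typing import Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random anomalous image (label 1) receives a higher
    score than a random normal image (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_max(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Best F1 over all candidate thresholds, together with that threshold.
    An image is predicted anomalous when its score >= threshold."""
    best_f1, best_t = 0.0, max(scores)
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_f1, best_t = f1, t
    return best_f1, best_t


# Toy data (assumed, not from the paper): four images, two of them anomalous.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]

print(auroc(toy_scores, toy_labels))   # 0.75
print(f1_max(toy_scores, toy_labels))  # (0.8, 0.35)
```

AUROC is threshold-free and measures how well the scores rank anomalous images above normal ones, while F1-max reports the best precision/recall trade-off reachable by any single threshold, which is why the two are usually quoted together.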

Abstract

Logical image understanding involves interpreting and reasoning about the relationships and consistency within an image's visual content. This capability is essential in applications such as industrial inspection, where logical anomaly detection is critical for maintaining high-quality standards and minimizing costly recalls. Previous research in anomaly detection (AD) has relied on prior knowledge for designing algorithms, which often requires extensive manual annotations, significant computing power, and large amounts of data for training. Autoregressive, multimodal Vision Language Models (AVLMs) offer a promising alternative due to their exceptional performance in visual reasoning across various domains. Despite this, their application to logical AD remains unexplored. In this work, we investigate using AVLMs for logical AD and demonstrate that they are well-suited to the task.…
