ADSeeker: A Knowledge-Grounded Reasoning Framework for Industry Anomaly Detection and Reasoning
Kai Zhang, Zekai Zhang, Xihe Sun, Anpeng Wang, Jingmeng Nie, Qinghui Chen, Han Hao, Jianyuan Guo, Jinglin Zhang

TL;DR
ADSeeker is a knowledge-grounded reasoning framework that improves industry anomaly detection by integrating a curated visual knowledge base, a retrieval-augmented generation approach, and a large-scale anomaly dataset.
Contribution
ADSeeker introduces a novel visual knowledge base (SEEK-M&V), a retrieval framework, and a large-scale anomaly dataset to enhance zero-shot anomaly detection in industrial inspection.
Findings
Achieves state-of-the-art zero-shot performance on benchmark datasets.
Effectively utilizes a curated visual knowledge base for anomaly understanding.
Demonstrates improved detection accuracy with the proposed framework.
Abstract
Automatic vision inspection holds significant importance in industry inspection. While multimodal large language models (MLLMs) exhibit strong language understanding capabilities and hold promise for this task, their performance remains significantly inferior to that of human experts. In this context, we identify two key challenges: (i) insufficient integration of anomaly detection (AD) knowledge during pre-training, and (ii) the lack of technically precise and context-aware language generation for anomaly reasoning. To address these issues, we propose ADSeeker, an anomaly task assistant designed to enhance inspection performance through knowledge-grounded reasoning. ADSeeker first leverages a curated visual document knowledge base, SEEK-M&V, which we construct to address the limitations of existing resources that rely solely on unstructured text. SEEK-M&V includes semantic-rich…
