AgentIAD: Agentic Industrial Anomaly Detection via Adaptive Memory Augmentation
Junwen Miao, Penghui Du, Yingying Fan, Yi Liu, Yu Wang, Runze He, Lida Huang, Yan Wang

TL;DR
AgentIAD introduces an iterative, memory-augmented vision-language framework for industrial anomaly detection, enabling active evidence gathering and improved accuracy over state-of-the-art methods.
Contribution
It presents a novel agentic inspection approach with dynamic memory access and a two-stage training strategy for enhanced anomaly detection.
Findings
Improves classification accuracy by 5.92% on the MMAD benchmark.
Enables multi-round reasoning for more reliable anomaly analysis.
Outperforms previous state-of-the-art methods with the same backbone.
Abstract
Industrial anomaly detection (IAD) is challenging due to the subtle and highly localized nature of many defects, which single-pass vision-language models (VLMs) often fail to capture. Moreover, existing approaches lack mechanisms to actively acquire complementary evidence during inference. We propose AgentIAD, an agentic vision-language framework that enables iterative industrial inspection through a unified action space. The agent dynamically accesses two forms of memory during inspection: visual memory via the Perceptive Zoomer (PZ) for fine-grained local analysis, and retrieved memory via the Web Searcher (WS) and Comparative Retriever (CR) for external knowledge acquisition and cross-instance verification. This design allows the model to progressively gather evidence through multi-round perception-action reasoning. To effectively learn such policies under sparse supervision,…
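The multi-round perception-action loop described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the `Action` enum, the `ANSWER` terminal action, the `inspect` function, the `max_rounds` budget, the "undecided" fallback, and the toy policy and tools are all hypothetical names introduced here. Only the three tools (Perceptive Zoomer, Web Searcher, Comparative Retriever) and the idea of a unified action space come from the abstract.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Tuple


class Action(Enum):
    """Unified action space; ANSWER is a hypothetical terminal action."""
    ZOOM = "perceptive_zoomer"         # PZ: visual memory, local analysis
    SEARCH = "web_searcher"            # WS: external knowledge
    COMPARE = "comparative_retriever"  # CR: cross-instance verification
    ANSWER = "answer"


@dataclass
class Step:
    """One round of the loop: the chosen action and what it returned."""
    action: Action
    argument: str
    observation: str


Policy = Callable[[str, List[Step]], Tuple[Action, str]]
Tool = Callable[[str], str]


def inspect(query: str, policy: Policy, tools: Dict[Action, Tool],
            max_rounds: int = 4) -> Tuple[str, List[Step]]:
    """Run the policy until it answers or the round budget is exhausted."""
    history: List[Step] = []
    for _ in range(max_rounds):
        action, argument = policy(query, history)
        if action is Action.ANSWER:
            return argument, history
        observation = tools[action](argument)
        history.append(Step(action, argument, observation))
    # Assumed fallback when no answer is produced within the budget.
    return "undecided", history


def toy_policy(query: str, history: List[Step]) -> Tuple[Action, str]:
    """Scripted stand-in for a learned policy: zoom, compare, then answer."""
    if len(history) == 0:
        return Action.ZOOM, "region:top-left"
    if len(history) == 1:
        return Action.COMPARE, "reference:normal-sample"
    return Action.ANSWER, "anomalous"


# Stub tools that only echo their argument; real tools would return images or text.
toy_tools: Dict[Action, Tool] = {
    Action.ZOOM: lambda arg: f"crop of {arg}",
    Action.SEARCH: lambda arg: f"search results for {arg}",
    Action.COMPARE: lambda arg: f"comparison against {arg}",
}

verdict, trace = inspect("Is this part defective?", toy_policy, toy_tools)
```

In this sketch the policy sees the full history of earlier observations at every round, which is one simple way to model progressive evidence gathering; how AgentIAD actually conditions on past rounds is not specified in the text above.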
