A VLM-based Method for Visual Anomaly Detection in Robotic Scientific Laboratories

Shiwei Lin; Chenxu Wang; Xiaozhen Ding; Yi Wang; Boyuan Du; Lei Song; Chenggang Wang; and Huaping Liu

arXiv:2506.05405 · cs.CV · April 21, 2026

TL;DR

This paper introduces a method based on vision-language models (VLMs) for detecting visual anomalies in robotic scientific laboratories. Detection accuracy improves as the prompt supplies more contextual information, and real-world tests validate the method.

Contribution

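The paper presents a VLM-based visual reasoning approach that supports different levels of supervision through four progressively informative prompt configurations, together with a new visual benchmark for process anomaly detection in scientific workflows.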

Findings

01. Detection accuracy improves with more contextual information.
02. The approach is effective and adaptable for process anomaly detection.
03. Real-world validation confirms the method's practical utility.

Abstract

In robot scientific laboratories, visual anomaly detection is important for the timely identification and resolution of potential faults or deviations. It has become a key factor in ensuring the stability and safety of experimental processes. To address this challenge, this paper proposes a VLM-based visual reasoning approach that supports different levels of supervision through four progressively informative prompt configurations. To systematically evaluate its effectiveness, we construct a visual benchmark tailored for process anomaly detection in scientific workflows. Experiments on two representative vision-language models show that detection accuracy improves as more contextual information is provided, confirming the effectiveness and adaptability of the proposed reasoning approach for process anomaly detection in scientific workflows. Furthermore, real-world validations at…
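
To make the idea of "four progressively informative prompt configurations" concrete, here is a minimal Python sketch. The abstract does not say what each configuration contains, so everything in it is an assumption for illustration: the `StepContext` fields, the wording of `BASE_QUESTION`, the ordering of the context items, and the `build_prompt` and `accuracy` helpers are hypothetical and are not the authors' prompts or code. No VLM is called; the sketch only shows how prompts could grow with supervision level and how per-level accuracy could be scored.

```python
from dataclasses import dataclass


@dataclass
class StepContext:
    """Hypothetical context for one workflow step (all fields are assumptions)."""
    task: str                   # what the robot is supposed to be doing
    expected_state: str         # what a normal image should show
    known_anomalies: list[str]  # example failure modes for this step


# Assumed base question; the paper's actual wording is not given in the abstract.
BASE_QUESTION = (
    "Does this image show an anomaly in the experimental process? "
    "Answer 'yes' or 'no'."
)


def build_prompt(ctx: StepContext, level: int) -> str:
    """Build a text prompt for supervision level 0..3.

    Level 0 is the bare question; each higher level prepends one more
    piece of context, giving four configurations in total.
    """
    if not 0 <= level <= 3:
        raise ValueError("level must be between 0 and 3")
    extras = [
        f"Task: {ctx.task}",
        f"Expected state: {ctx.expected_state}",
        "Known anomalies: " + "; ".join(ctx.known_anomalies),
    ]
    return "\n".join(extras[:level] + [BASE_QUESTION])


def accuracy(predictions: list[bool], labels: list[bool]) -> float:
    """Fraction of images whose predicted anomaly flag matches the label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)
```

In use, one would build the prompt for each level, send it with the image to the VLM of choice, parse the yes/no answer into a boolean, and compare `accuracy` across the four levels to see whether added context helps on a given benchmark.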
