Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM
Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Chuchu Han, Xiaonan Huang, Changxin Gao, Yuehuan Wang, Nong Sang

TL;DR
Holmes-VAD is a multimodal, explainable video anomaly detection framework that combines precise temporal supervision with a large-scale instruction-tuning benchmark to reduce biased detections on challenging or unseen events and to explain its decisions.
Contribution
The paper presents VAD-Instruct50k, which the authors describe as the first large-scale multimodal VAD instruction-tuning benchmark, and a framework built on it that aims to make video anomaly detection less biased and more interpretable.
Findings
Holmes-VAD achieves improved anomaly localization accuracy.
The framework provides comprehensive explanations for detections.
Benchmark and model are publicly available for community use.
Abstract
Towards open-ended Video Anomaly Detection (VAD), existing methods often exhibit biased detection when faced with challenging or unseen events and lack interpretability. To address these drawbacks, we propose Holmes-VAD, a novel framework that leverages precise temporal supervision and rich multimodal instructions to enable accurate anomaly localization and comprehensive explanations. Firstly, towards an unbiased and explainable VAD system, we construct the first large-scale multimodal VAD instruction-tuning benchmark, i.e., VAD-Instruct50k. This dataset is created using a carefully designed semi-automatic labeling paradigm. Efficient single-frame annotations are applied to the collected untrimmed videos, which are then synthesized into high-quality analyses of both abnormal and normal video clips using a robust off-the-shelf video captioner and a large language model (LLM). Building upon…
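Illustration
The labeling paradigm described in the abstract (single-frame annotation, then clip-level analysis from a captioner and an LLM) can be pictured with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the fixed-width window around the annotated frame, the names `Clip`, `clip_around_frame`, and `build_instruction_record`, the record fields, and the prompt wording. The captioner and LLM calls are left out; the caption is a placeholder string.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    video_id: str
    start: int  # first frame index, inclusive
    end: int    # last frame index, inclusive
    label: str  # "abnormal" or "normal"


def clip_around_frame(video_id, frame_idx, num_frames, half_width=32):
    """Expand one annotated frame into a clip, clamped to the video bounds.

    The fixed half_width is a hypothetical choice, not the paper's rule.
    """
    if not 0 <= frame_idx < num_frames:
        raise ValueError("annotated frame lies outside the video")
    start = max(0, frame_idx - half_width)
    end = min(num_frames - 1, frame_idx + half_width)
    return Clip(video_id, start, end, "abnormal")


def build_instruction_record(clip, caption):
    """Pair a clip with a prompt; the response would be written by an LLM."""
    return {
        "video": clip.video_id,
        "frames": [clip.start, clip.end],
        "label": clip.label,
        "prompt": "Is there an anomaly in this clip? Explain your answer.",
        "context": caption,  # captioner output that the LLM would see
    }


# One annotated frame near the start of a 300-frame video.
clip = clip_around_frame("video_001", frame_idx=10, num_frames=300)
record = build_instruction_record(clip, caption="(caption from a video captioner)")
print(record["frames"])  # [0, 42]
```

The window is clamped at frame 0 because the annotated frame sits closer to the start of the video than `half_width`, which is why the clip covers frames 0 to 42 instead of a symmetric span.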
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Digital Media Forensic Detection
