Benchmarking Compact VLMs for Clip-Level Surveillance Anomaly Detection Under Weak Supervision
Kirill Borodin, Kirill Kondrashov, Nikita Vasiliev, Ksenia Gladkova, Inna Larina, Mikhail Gorodnichev, Grach Mkrtchian

TL;DR
This paper evaluates compact vision-language models (VLMs) for clip-level surveillance anomaly detection under weak supervision. Under a standardized evaluation protocol, parameter-efficiently adapted compact VLMs match, and in several cases exceed, established baselines while keeping per-clip latency competitive.
Contribution
It introduces a unified evaluation protocol and shows that compact VLMs with parameter-efficient adaptation can serve as reliable, efficient anomaly detectors in surveillance settings.
Findings
Compact VLMs with parameter-efficient adaptation perform on par with, and in several cases better than, established baselines.
Parameter-efficient adaptation reduces prompt sensitivity.
Adapted compact VLMs retain competitive per-clip latency while preserving detection accuracy.
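The summary does not say how prompt sensitivity is measured, so the following is only a minimal sketch of one plausible way to quantify it: score the same model under several prompt wordings and report the spread. The function name `prompt_sensitivity`, the choice of F1 as the tracked metric, and the example prompt labels are all assumptions for illustration, not the paper's definition.

```python
from statistics import mean, pstdev
from typing import Dict


def prompt_sensitivity(metric_by_prompt: Dict[str, float]) -> Dict[str, float]:
    """Summarize how much a metric varies across prompt variants.

    Hypothetical helper: a smaller std/range means the model behaves
    more consistently when the prompt wording changes.
    """
    values = list(metric_by_prompt.values())
    if not values:
        raise ValueError("need at least one prompt variant")
    return {
        "mean": mean(values),
        "std": pstdev(values),              # population std over the variants
        "range": max(values) - min(values),  # worst-case gap between prompts
    }


# Illustrative numbers only (not results from the paper).
f1_by_prompt = {"prompt_a": 0.80, "prompt_b": 0.70, "prompt_c": 0.90}
summary = prompt_sensitivity(f1_by_prompt)
```

Comparing this summary before and after adaptation would be one way to check the claim that adaptation makes behavior more consistent across prompts.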
Abstract
CCTV safety monitoring demands that anomaly detectors combine reliable clip-level accuracy with predictable per-clip latency despite weak supervision. This work investigates compact vision-language models (VLMs) as practical detectors for this regime. A unified evaluation protocol standardizes preprocessing, prompting, dataset splits, metrics, and runtime settings to compare parameter-efficiently adapted compact VLMs against training-free VLM pipelines and weakly supervised baselines. Evaluation spans accuracy, precision, recall, F1, ROC-AUC, and average per-clip latency to jointly quantify detection quality and efficiency. With parameter-efficient adaptation, compact VLMs achieve performance on par with, and in several cases exceeding, established approaches while retaining competitive per-clip latency. Adaptation further reduces prompt sensitivity, producing more consistent behavior across…
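The metrics named in the abstract can be computed from per-clip labels and anomaly scores. The sketch below uses only the standard library and is an illustration under stated assumptions, not the paper's evaluation code: the 0.5 decision threshold, the function names, and the `predict` callable are hypothetical.

```python
import time
from typing import Callable, Dict, Sequence


def clip_metrics(y_true: Sequence[int], scores: Sequence[float],
                 threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary clip-level labels.

    Assumes label 1 = anomalous clip, and an assumed threshold of 0.5.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (anomalous, normal) pairs ranked correctly.

    Ties count as half. This is O(n_pos * n_neg), fine for a small sketch.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one clip of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_latency(predict: Callable[[object], float],
                 clips: Sequence[object]) -> float:
    """Average wall-clock seconds per clip for a hypothetical `predict`."""
    start = time.perf_counter()
    for clip in clips:
        predict(clip)
    return (time.perf_counter() - start) / len(clips)


# Illustrative labels and scores only (not data from the paper).
y_true = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.3, 0.1]
metrics = clip_metrics(y_true, scores)
auc = roc_auc(y_true, scores)
latency = mean_latency(lambda clip: 0.0, ["clip_1", "clip_2"])
```

Reporting the threshold-dependent metrics alongside the threshold-free ROC-AUC and the latency figure gives the joint view of detection quality and efficiency that the abstract describes.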
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Data Stream Mining Techniques · Network Security and Intrusion Detection
