Cascading multi-agent anomaly detection in surveillance systems via vision-language models and embedding-based classification
Tayyab Rehman, Giovanni De Gasperis, Aly Shmahell

TL;DR
This paper presents a multi-agent cascade framework that combines reconstruction, object detection, and vision-language reasoning for anomaly detection in surveillance, reporting a threefold latency reduction relative to direct vision-language inference while keeping its outputs interpretable.
Contribution
It introduces a multi-agent cascade architecture that unifies reconstruction-based, object-detection, and vision-language paradigms for efficient, interpretable anomaly detection in visual surveillance systems.
Findings
Achieves a threefold reduction in latency compared to direct vision-language inference.
Maintains high perceptual quality with PSNR of 38.3 dB and SSIM of 0.965.
Demonstrates scalable deployment and high semantic labeling accuracy.
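For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the conventional definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr`, the flat pixel-sequence input, and the 8-bit default `max_value` are assumptions made for illustration; this summary does not state how the paper computes the metric.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Illustrative helper only: assumes flat sequences of pixel intensities
    and an 8-bit dynamic range unless max_value is overridden.
    """
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two signals.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference; identical inputs give infinity.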
Abstract
Intelligent anomaly detection in dynamic visual environments requires reconciling real-time performance with semantic interpretability. Conventional approaches address only fragments of this challenge. Reconstruction-based models capture low-level deviations without contextual reasoning, object detectors provide speed but limited semantics, and large vision-language systems deliver interpretability at prohibitive computational cost. This work introduces a cascading multi-agent framework that unifies these complementary paradigms into a coherent and interpretable architecture. Early modules perform reconstruction-gated filtering and object-level assessment, while higher-level reasoning agents are selectively invoked to interpret semantically ambiguous events. The system employs adaptive escalation thresholds and a publish-subscribe communication backbone, enabling asynchronous…
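The cascade described in the abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the names `CascadeConfig` and `classify_frame`, the callables passed in, and all threshold values are hypothetical. The thresholds here are fixed rather than adaptive, and the publish-subscribe backbone is left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class CascadeConfig:
    # Hypothetical values; the paper describes adaptive thresholds instead.
    recon_threshold: float = 0.05  # below this reconstruction error, treat as normal
    detector_low: float = 0.3      # at or below this detector score, treat as normal
    detector_high: float = 0.8     # at or above this detector score, treat as anomaly


def classify_frame(
    frame: Any,
    recon_error: Callable[[Any], float],
    detector_score: Callable[[Any], float],
    vlm_describe: Callable[[Any], str],
    cfg: Optional[CascadeConfig] = None,
) -> Tuple[str, str]:
    """Return (label, stage), escalating only when a cheaper stage is inconclusive."""
    cfg = cfg or CascadeConfig()

    # Stage 1: reconstruction gate filters out frames that look normal.
    if recon_error(frame) < cfg.recon_threshold:
        return ("normal", "reconstruction")

    # Stage 2: object-level assessment settles the confident cases.
    score = detector_score(frame)
    if score >= cfg.detector_high:
        return ("anomaly", "detector")
    if score <= cfg.detector_low:
        return ("normal", "detector")

    # Stage 3: only ambiguous frames reach the expensive vision-language model.
    return (vlm_describe(frame), "vlm")
```

Because most frames exit at the first or second stage, the vision-language model runs on a small fraction of the input, which is the mechanism by which a cascade of this kind would reduce average latency.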
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Human Pose and Action Recognition
