Chain-of-Anomaly Thoughts with Large Vision-Language Models
Pedro Domingos, João Pereira, Vasco Lopes, João Neves, David Semedo

TL;DR
This paper introduces Chain-of-Anomaly-Thoughts (CoAT), a multi-agent reasoning framework that enhances large vision-language models' ability to detect anomalies and crimes in video surveillance by incorporating inductive criminal biases.
Contribution
The paper proposes CoAT, a novel multi-agent reasoning approach that integrates inductive anomaly biases into vision-language models for improved crime and anomaly detection.
Findings
Boosted F1-score by 11.8 percentage points on low-resolution footage
Improved anomaly classification by 3.78 percentage points on high-resolution videos
Significantly enhanced detection performance in challenging surveillance scenarios
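The gains above are reported in percentage points (p.p.), i.e. absolute differences between two scores rather than relative improvements. A minimal sketch of how an F1-score and a p.p. gain are computed follows; the helper names and the counts in the usage lines are hypothetical and do not come from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def gain_in_percentage_points(baseline: float, improved: float) -> float:
    """Absolute difference between two scores in [0, 1], expressed in p.p."""
    return (improved - baseline) * 100.0


# Hypothetical confusion counts, for illustration only (not the paper's data).
baseline_f1 = f1_score(tp=50, fp=20, fn=50)
improved_f1 = f1_score(tp=65, fp=25, fn=35)
gain = gain_in_percentage_points(baseline_f1, improved_f1)
print(f"baseline={baseline_f1:.3f} improved={improved_f1:.3f} gain={gain:.1f} p.p.")
```

Under this convention, moving from an F1 of 0.500 to 0.618 is a gain of 11.8 p.p., even though the relative improvement is larger.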
Abstract
Automated video surveillance with Large Vision-Language Models is limited by their inherent bias towards normality, often failing to detect crimes. While Chain-of-Thought reasoning strategies show significant potential for improving performance in language tasks, the lack of inductive anomaly biases in their reasoning further steers the models towards normal interpretations. To address this, we propose Chain-of-Anomaly-Thoughts (CoAT), a multi-agent reasoning framework that introduces inductive criminal bias in the reasoning process through a final, anomaly-focused classification layer. Our method significantly improves Anomaly Detection, boosting F1-score by 11.8 p.p. on challenging low-resolution footage and Anomaly Classification by 3.78 p.p. in high-resolution videos.
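The abstract describes a multi-agent reasoning process that ends in an anomaly-focused classification step. The sketch below illustrates that general shape only and is not the authors' implementation: the agent roles, the prompt wording, the label strings, and the names `run_chain` and `ChainResult` are all assumptions, and each agent is a plain Python callable standing in for a vision-language model call.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: an "agent" is any callable mapping a text prompt to a text reply.
# In a real system this would wrap a vision-language model call.
Agent = Callable[[str], str]


@dataclass
class ChainResult:
    thoughts: List[str]  # intermediate reasoning steps, in order
    label: str           # output of the final anomaly-focused step


def run_chain(scene: str, reasoners: List[Agent], classifier: Agent) -> ChainResult:
    """Run each reasoning agent in turn, then a final anomaly-focused classifier."""
    thoughts: List[str] = []
    context = scene
    for agent in reasoners:
        thought = agent(context)
        thoughts.append(thought)
        context = context + "\n" + thought  # later agents see earlier thoughts
    # Hypothetical prompt that biases the last step towards anomalous readings.
    prompt = "Before concluding, consider whether this could be a crime.\n" + context
    return ChainResult(thoughts=thoughts, label=classifier(prompt))


# Stub agents for illustration; they return fixed strings, not model output.
def describe(ctx: str) -> str:
    return "A person lingers next to a parked car."


def interpret(ctx: str) -> str:
    return "The person forces the car door open."


def classify(prompt: str) -> str:
    return "anomalous" if "forces" in prompt else "normal"


result = run_chain("Night-time parking lot clip.", [describe, interpret], classify)
print(result.label)
```

Keeping the agents as interchangeable callables makes it easy to swap the stubs for real model calls without changing the control flow.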
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI) · Human Pose and Action Recognition
