Recognition of Abnormal Events in Surveillance Videos using Weakly Supervised Dual-Encoder Models
Noam Tsfaty, Avishai Weizman, Liav Cohen, Moshe Tshuva, Yehudit Aperstein

TL;DR
This paper introduces a dual-encoder framework that combines convolutional and transformer models under weak (video-level) supervision to detect anomalies in surveillance videos, reaching 90.7% AUC on the UCF-Crime dataset.
Contribution
The paper proposes a novel weakly supervised dual-encoder model that integrates convolutional and transformer features for anomaly detection in videos.
Findings
Achieves 90.7% AUC on the UCF-Crime dataset
Effectively detects rare and diverse anomalies with weak supervision
Combines convolutional and transformer representations for improved performance
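For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how the area under the ROC curve can be computed from binary labels and anomaly scores. It uses the pairwise-ranking definition (the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one). The function name `roc_auc` and the toy data are illustrative assumptions and are not taken from the paper; the paper's own evaluation protocol is not described in this summary.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (hypothetical helper).

    labels: 1 for anomalous, 0 for normal.
    scores: predicted anomaly scores, higher means more anomalous.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # anomalous sample ranked above normal one
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration only).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The quadratic loop is fine for small examples; for large test sets a sort-based implementation would be the usual choice.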
Abstract
We address the challenge of detecting rare and diverse anomalies in surveillance videos using only video-level supervision. Our dual-backbone framework combines convolutional and transformer representations through top-k pooling, achieving 90.7% area under the curve (AUC) on the UCF-Crime dataset.
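The top-k pooling idea mentioned in the abstract can be sketched as follows. With only video-level labels, a common approach is to score each temporal segment, then average the k highest segment scores to obtain a single video-level score that can be compared with the video label. This sketch makes several assumptions that are not stated in this summary: that the two backbones each yield per-segment anomaly scores, that they are fused by a weighted average (`fuse_scores`, `weight`), and the value of `k`. All names are hypothetical, and the paper's actual fusion may differ (for example, it may operate on features rather than scores).

```python
from typing import List, Sequence


def fuse_scores(conv_scores: Sequence[float],
                transformer_scores: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Assumed late fusion: weighted average of per-segment scores."""
    if len(conv_scores) != len(transformer_scores):
        raise ValueError("both encoders must score the same segments")
    return [weight * c + (1.0 - weight) * t
            for c, t in zip(conv_scores, transformer_scores)]


def topk_pool(segment_scores: Sequence[float], k: int = 3) -> float:
    """Video-level score: mean of the k highest segment scores."""
    if not segment_scores:
        raise ValueError("need at least one segment score")
    k = max(1, min(k, len(segment_scores)))  # clamp k to a valid range
    top = sorted(segment_scores, reverse=True)[:k]
    return sum(top) / k


# Toy per-segment scores from the two (hypothetical) encoders.
conv = [0.1, 0.2, 0.9, 0.8, 0.1]
trans = [0.3, 0.2, 0.7, 1.0, 0.1]

fused = fuse_scores(conv, trans)        # per-segment fused scores
video_score = topk_pool(fused, k=2)     # approximately 0.85 for this toy input
```

Averaging only the top-k segments lets a short anomalous event dominate the video-level score, which is why this pooling suits weak supervision: segments in a long, mostly normal video do not dilute the signal.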
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Video Analysis and Summarization
