Recognition of Abnormal Events in Surveillance Videos using Weakly Supervised Dual-Encoder Models

Noam Tsfaty; Avishai Weizman; Liav Cohen; Moshe Tshuva; Yehudit Aperstein

arXiv:2511.13276 · cs.CV · November 18, 2025

TL;DR

This paper introduces a weakly supervised dual-encoder framework that combines convolutional and transformer representations to detect anomalies in surveillance videos using only video-level labels, reaching 90.7% AUC on the UCF-Crime dataset.

Contribution

The paper proposes a novel weakly supervised dual-encoder model that integrates convolutional and transformer features for anomaly detection in videos.

Findings

01 Achieved 90.7% AUC on the UCF-Crime dataset

02 Detects rare and diverse anomalies using only video-level (weak) supervision

03 Combines convolutional and transformer representations through top-k pooling
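
For readers unfamiliar with the metric in the first finding, AUC can be read as the probability that a randomly chosen anomalous sample receives a higher score than a randomly chosen normal one. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper, and the listing does not state at what granularity the reported figure was evaluated.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]    # scores of anomalous samples
    neg = scores[~labels]   # scores of normal samples
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Compare every positive score against every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

# Hypothetical toy data: 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations; a rank-based implementation would be the usual choice for a full test set.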

Abstract

We address the challenge of detecting rare and diverse anomalies in surveillance videos using only video-level supervision. Our dual-backbone framework combines convolutional and transformer representations through top-k pooling, achieving 90.7% area under the curve (AUC) on the UCF-Crime dataset.
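
The abstract names top-k pooling as the step that turns per-segment evidence into a video-level prediction, which is what allows training from video-level labels alone. A minimal sketch of that idea follows. Everything specific in it is an assumption made for illustration: fusing the two encoders by concatenation, the linear scorer, the sigmoid, the value of `k`, and the names `topk_video_score` and `score_video`. The paper's actual architecture may differ.

```python
import numpy as np

def topk_video_score(segment_scores: np.ndarray, k: int) -> float:
    """Video-level score as the mean of the k highest segment scores."""
    k = max(1, min(k, segment_scores.shape[0]))
    # np.partition moves the k largest values into the last k slots (unordered).
    top = np.partition(segment_scores, -k)[-k:]
    return float(top.mean())

def score_video(conv_feats, trans_feats, weights, bias, k=3):
    """Score one video from two per-segment feature matrices.

    conv_feats:  (num_segments, d_conv)  hypothetical convolutional features
    trans_feats: (num_segments, d_trans) hypothetical transformer features
    weights:     (d_conv + d_trans,)     hypothetical linear scorer
    """
    # Assumption: the two representations are fused by concatenation.
    fused = np.concatenate([conv_feats, trans_feats], axis=1)
    logits = fused @ weights + bias
    segment_scores = 1.0 / (1.0 + np.exp(-logits))  # sigmoid into (0, 1)
    return topk_video_score(segment_scores, k), segment_scores

# Toy usage with random features for an 8-segment video.
rng = np.random.default_rng(0)
video_score, seg_scores = score_video(
    rng.normal(size=(8, 4)), rng.normal(size=(8, 6)),
    weights=rng.normal(size=10), bias=0.0, k=3,
)
```

Averaging only the top-k segments means a video is flagged when a few segments look anomalous, which matches the setting where anomalies are rare within an otherwise normal video. During training, the video-level score would be compared against the video-level label.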

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Video Analysis and Summarization