Learning Event Completeness for Weakly Supervised Video Anomaly Detection
Yu Wang, Shiwei Chen

TL;DR
This paper introduces LEC-VAD, a weakly supervised video anomaly detection method that uses semantic regularities and memory bank-based prototype learning to localize anomalous events more completely and to enrich their text descriptions. It is reported to outperform state-of-the-art methods on XD-Violence and UCF-Crime.
Contribution
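The paper proposes a dual-structure model that encodes both category-aware and category-agnostic semantics between vision and language. It combines semantic regularities with a memory bank mechanism to improve event completeness and text expressiveness in WS-VAD.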
Findings
Outperforms state-of-the-art methods on the XD-Violence and UCF-Crime datasets
Uses anomaly-aware Gaussian mixture for precise event boundaries
Employs memory bank-based prototype learning to enrich text descriptions
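The summary does not give the formulation of the anomaly-aware Gaussian mixture, so the following is only a generic sketch of the underlying idea, not the authors' method: a smooth mixture curve over normalized time can bridge a dip in snippet-level scores, so that thresholding yields one complete interval instead of fragments. The function names, the fixed mixture parameters, the 0.5 blending weight, and the 0.5 threshold are all assumptions chosen for illustration. In a learned model the mixture parameters would presumably be predicted rather than fixed.

```python
import numpy as np

def gaussian_mixture_curve(num_snippets, centers, widths, mix):
    """Evaluate a 1-D Gaussian mixture over snippet positions in [0, 1].

    Hypothetical helper: centers, widths, and mix are fixed inputs here.
    """
    t = np.linspace(0.0, 1.0, num_snippets)               # (T,)
    centers = np.asarray(centers, dtype=float)[:, None]   # (K, 1)
    widths = np.asarray(widths, dtype=float)[:, None]     # (K, 1)
    mix = np.asarray(mix, dtype=float)[:, None]           # (K, 1)
    comps = np.exp(-0.5 * ((t[None, :] - centers) / widths) ** 2)  # (K, T)
    curve = (mix * comps).sum(axis=0)
    return curve / curve.max()                            # rescale so the peak is 1

def extract_intervals(scores, threshold=0.5):
    """Return inclusive (start, end) index pairs where scores >= threshold."""
    intervals, start = [], None
    for i, above in enumerate(np.asarray(scores) >= threshold):
        if above and start is None:
            start = i
        elif not above and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(scores) - 1))
    return intervals

# Toy snippet scores: one event spanning indices 3..5 with a dip at index 4.
raw = np.array([0.1, 0.1, 0.2, 0.8, 0.3, 0.9, 0.2, 0.1, 0.1, 0.1, 0.1])

# A single Gaussian centered on the event (assumed parameters).
curve = gaussian_mixture_curve(len(raw), centers=[0.4], widths=[0.1], mix=[1.0])

# Blend raw scores with the smooth curve (the 0.5 weight is an assumption).
blended = 0.5 * raw + 0.5 * curve

raw_intervals = extract_intervals(raw)          # fragmented: [(3, 3), (5, 5)]
blended_intervals = extract_intervals(blended)  # complete:   [(3, 5)]
```

On this toy input, thresholding the raw scores splits the event into two single-snippet fragments, while the blended scores recover it as one interval.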
Abstract
Weakly supervised video anomaly detection (WS-VAD) is tasked with pinpointing temporal intervals containing anomalous events within untrimmed videos, utilizing only video-level annotations. However, a significant challenge arises due to the absence of dense frame-level annotations, often leading to incomplete localization in existing WS-VAD methods. To address this issue, we present a novel LEC-VAD, Learning Event Completeness for Weakly Supervised Video Anomaly Detection, which features a dual structure designed to encode both category-aware and category-agnostic semantics between vision and language. Within LEC-VAD, we devise semantic regularities that leverage an anomaly-aware Gaussian mixture to learn precise event boundaries, thereby yielding more complete event instances. Besides, we develop a novel memory bank-based prototype learning mechanism to enrich concise text descriptions…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Human Pose and Action Recognition
