Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors
Nicolae-Catalin Ristea, Florinel-Alin Croitoru, Radu Tudor Ionescu, Marius Popescu, Fahad Shahbaz Khan, Mubarak Shah

TL;DR
This paper introduces a lightweight, self-distilled masked auto-encoder for efficient video anomaly detection. The model focuses on foreground motion, uses dual decoders, and trains with synthetic abnormal data to achieve high speed and competitive accuracy.
Contribution
The paper presents a novel, efficient video anomaly detection model that incorporates motion-gradient-based token weighting, a teacher-student decoder architecture, and synthetic abnormal event generation.
Findings
Achieves 1655 FPS processing speed.
Outperforms or matches state-of-the-art accuracy on four benchmarks.
Demonstrates significant speed improvement over existing methods.
Abstract
We propose an efficient abnormal event detection model based on a lightweight masked auto-encoder (AE) applied at the video frame level. The novelty of the proposed model is threefold. First, we introduce an approach to weight tokens based on motion gradients, thus shifting the focus from the static background scene to the foreground objects. Second, we integrate a teacher decoder and a student decoder into our architecture, leveraging the discrepancy between the outputs given by the two decoders to improve anomaly detection. Third, we generate synthetic abnormal events to augment the training videos, and task the masked AE model to jointly reconstruct the original frames (without anomalies) and the corresponding pixel-level anomaly maps. Our design leads to an efficient and effective model, as demonstrated by the extensive experiments carried out on four benchmarks: Avenue,…
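The first idea in the abstract, weighting tokens by motion so that the loss concentrates on moving foreground objects, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes that the absolute difference between consecutive grayscale frames stands in for the motion gradients, and that tokens are non-overlapping square patches. The function names and the `patch` and `eps` parameters are hypothetical.

```python
import numpy as np


def patch_means(image, patch):
    """Average a 2-D array over non-overlapping patch x patch tiles."""
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("frame size must be divisible by patch size")
    tiles = image.reshape(h // patch, patch, w // patch, patch)
    return tiles.mean(axis=(1, 3))


def motion_token_weights(prev_frame, frame, patch, eps=1e-8):
    """Per-token weights proportional to mean absolute frame difference.

    Assumption: frame differencing approximates the motion gradients
    mentioned in the abstract; the paper may compute them differently.
    """
    motion = np.abs(frame.astype(np.float64) - prev_frame.astype(np.float64))
    per_token = patch_means(motion, patch)
    return per_token / (per_token.sum() + eps)


def weighted_reconstruction_loss(target, recon, weights, patch):
    """Sum of per-token mean squared errors, scaled by the token weights."""
    sq_err = (target.astype(np.float64) - recon.astype(np.float64)) ** 2
    return float((weights * patch_means(sq_err, patch)).sum())


if __name__ == "__main__":
    prev = np.zeros((8, 8))
    cur = prev.copy()
    cur[:4, :4] = 1.0  # motion only in the top-left token
    w = motion_token_weights(prev, cur, patch=4)
    print(w)
    print(weighted_reconstruction_loss(cur, prev, w, patch=4))
```

Under these assumptions, a reconstruction error in a static background patch receives a weight of zero and does not contribute to the loss, while an error in a moving patch is counted in proportion to the amount of motion there.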
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Artificial Immune Systems Applications
Methods: Focus · Autoencoders · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
