Memory-augmented Online Video Anomaly Detection
Leonardo Rossi, Vittorio Bernuzzi, Tomaso Fontanini, Massimo Bertozzi, Andrea Prati

TL;DR
This paper introduces MOVAD, a modular, end-to-end system for real-time video anomaly detection in autonomous vehicle scenarios, combining short-term and long-term memory modules to improve detection accuracy.
Contribution
The paper presents a novel online video anomaly detection architecture that integrates short-term and long-term memory modules, achieving state-of-the-art performance with a simple, end-to-end trainable design.
Findings
Achieved 82.17% AUC on the DoTA dataset, surpassing previous methods.
Demonstrated the effectiveness of combining VST and LSTM modules for anomaly detection.
Provided a modular architecture that is easy to implement and adapt.
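For readers unfamiliar with the AUC metric reported above, the sketch below computes it from per-frame anomaly scores and binary labels. It assumes frame-level evaluation (1 = anomalous frame, 0 = normal frame), which this page does not state; the function name `frame_level_auc` and the toy data are illustrative and not taken from the paper.

```python
from typing import Sequence


def frame_level_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a randomly chosen anomalous frame
    receives a higher score than a randomly chosen normal frame.
    Ties count as half a win (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # anomalous frames
    neg = [s for s, y in zip(scores, labels) if y == 0]  # normal frames
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up numbers, not results from the paper).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = frame_level_auc(scores, labels)
print(f"AUC = {auc:.2f}")
```

The pairwise loop is quadratic in the number of frames, so it is suitable only for illustration; a rank-based implementation would be preferable on full datasets.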
Abstract
The ability to understand the surrounding scene is of paramount importance for Autonomous Vehicles (AVs). This paper presents a system capable of working in an online fashion, giving an immediate response to the onset of anomalies surrounding the AV, exploiting only the videos captured by a dash-mounted camera. Our architecture, called MOVAD, relies on two main modules: a Short-Term Memory Module to extract information related to the ongoing action, implemented by a Video Swin Transformer (VST), and a Long-Term Memory Module injected inside the classifier that also considers remote past information and action context thanks to the use of a Long Short-Term Memory (LSTM) network. The strengths of MOVAD are not only linked to its excellent performance, but also to its straightforward and modular architecture, trained in an end-to-end fashion with only RGB frames with as few assumptions as…
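The online data flow described in the abstract (a short window of recent frames feeding a short-term module, plus a state carried across the whole video by a long-term module) can be sketched roughly as follows. This is only a structural illustration under strong assumptions: the frame-difference feature stands in for the VST, an exponential moving average stands in for the LSTM, and the scoring rule, window size, and all names are hypothetical rather than taken from MOVAD.

```python
import math
from collections import deque
from typing import Deque, Iterable, Iterator, List

Frame = List[float]  # hypothetical: a frame flattened to pixel intensities


def short_term_features(window: Deque[Frame]) -> float:
    """Stand-in for the VST: mean frame-to-frame change over the window."""
    frames = list(window)
    if len(frames) < 2:
        return 0.0
    diffs = [
        sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        for prev, cur in zip(frames, frames[1:])
    ]
    return sum(diffs) / len(diffs)


def update_long_term(state: float, feature: float, decay: float = 0.9) -> float:
    """Stand-in for the LSTM: a running average kept for the whole video."""
    return decay * state + (1.0 - decay) * feature


def online_scores(frames: Iterable[Frame], window_size: int = 4) -> Iterator[float]:
    """Yield one anomaly score per incoming frame, using only past frames."""
    window: Deque[Frame] = deque(maxlen=window_size)  # short-term memory
    state = 0.0                                       # long-term memory
    for frame in frames:
        window.append(frame)
        feature = short_term_features(window)
        # Hypothetical classifier: sigmoid of how far the current short-term
        # feature deviates from the long-term context.
        yield 1.0 / (1.0 + math.exp(-(feature - state)))
        state = update_long_term(state, feature)


# Six static frames followed by an abrupt change (toy data).
video = [[0.0, 0.0] for _ in range(6)] + [[5.0, 5.0]]
scores = list(online_scores(video))
```

Because each score depends only on the current window and the carried state, the loop can run frame by frame on a live stream, which is the property the abstract refers to as working in an online fashion.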
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods · Network Security and Intrusion Detection
Methods: Attention Is All You Need · Linear Layer · Label Smoothing · Dense Connections · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Multi-Head Attention · Dropout · Residual Connection
