Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles
Guodong Wang, Yunhong Wang, Jie Qin, Dongming Zhang, Xiuguo Bao, Di Huang

TL;DR
This paper introduces a self-supervised video anomaly detection method using decoupled spatio-temporal jigsaw puzzles, which effectively captures appearance and motion features to distinguish normal from abnormal events.
Contribution
The novel approach decouples spatial and temporal puzzle solving, uses full permutations for varied difficulty, and operates end-to-end without pre-trained models, advancing VAD techniques.
Findings
Outperforms state-of-the-art on three benchmarks.
Achieves significant improvement on ShanghaiTech Campus.
Effectively captures discriminative spatio-temporal features.
Abstract
Video Anomaly Detection (VAD) is an important topic in computer vision. Motivated by the recent advances in self-supervised learning, this paper addresses VAD by solving an intuitive yet challenging pretext task, i.e., spatio-temporal jigsaw puzzles, which is cast as a multi-label fine-grained classification problem. Our method exhibits several advantages over existing works: 1) the spatio-temporal jigsaw puzzles are decoupled in terms of spatial and temporal dimensions, responsible for capturing highly discriminative appearance and motion features, respectively; 2) full permutations are used to provide abundant jigsaw puzzles covering various difficulty levels, allowing the network to distinguish subtle spatio-temporal differences between normal and abnormal events; and 3) the pretext task is tackled in an end-to-end manner without relying on any pre-trained models. Our method…
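The decoupling described in the abstract can be illustrated with a small sketch of how the two puzzle types and their multi-label targets might be built. This is not the authors' implementation: the function names, the list-of-patches clip layout, and the choice to share one spatial permutation across all frames are assumptions made for illustration. Only the ideas of separate spatial and temporal shuffles, sampling from the full set of permutations, and one position label per piece come from the abstract.

```python
import random


def sample_permutation(k, rng):
    """Draw one of the k! orderings uniformly, i.e. from the full permutation set."""
    perm = list(range(k))
    rng.shuffle(perm)
    return perm


def temporal_puzzle(frames, rng):
    """Shuffle the frame order only (motion cue).

    Returns the shuffled clip and the multi-label target: for each slot,
    the original index of the frame now sitting there.
    """
    perm = sample_permutation(len(frames), rng)
    return [frames[i] for i in perm], perm


def spatial_puzzle(frames, rng):
    """Shuffle patches inside each frame only (appearance cue).

    Assumption: the same patch permutation is applied to every frame,
    so the frame order is left untouched.
    """
    perm = sample_permutation(len(frames[0]), rng)
    return [[frame[i] for i in perm] for frame in frames], perm


def restore(items, perm):
    """Undo a shuffle given the per-slot labels (predicted or ground truth)."""
    out = [None] * len(perm)
    for slot, original_index in enumerate(perm):
        out[original_index] = items[slot]
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy clip: 3 frames, each split into 4 patches.
    clip = [[(t, p) for p in range(4)] for t in range(3)]
    shuffled, labels = temporal_puzzle(clip, rng)
    print(labels, restore(shuffled, labels) == clip)
```

In a setup like this, a network would predict one label per slot, and a poor prediction on a test clip could serve as the anomaly signal. How the paper actually scores anomalies is not stated in the text above.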
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Digital Media Forensic Detection · Human Pose and Action Recognition
Methods: Jigsaw
