Patch Spatio-Temporal Relation Prediction for Video Anomaly Detection

Hao Shen; Lu Shi; Wanru Xu; Yigang Cen; Linna Zhang; Gaoyun An

arXiv:2403.19111 · cs.CV · March 29, 2024 · 1 citation

TL;DR

This paper introduces a self-supervised vision transformer approach to video anomaly detection that models spatial and temporal relationships between patches. It is reported to outperform pixel-generation-based methods and other self-supervised approaches on three benchmarks.

Contribution

A two-branch vision transformer network, with one branch for the spatial dimension and one for the temporal dimension, in which the inter-patch relationship is decoupled into inter-patch similarity and patch order prediction.

Findings

1. Outperforms pixel-generation-based methods on three benchmarks.
2. Surpasses other self-supervised learning approaches.
3. Effectively models spatial and temporal coherence in videos.

Abstract

Video Anomaly Detection (VAD), aiming to identify abnormalities within a specific context and timeframe, is crucial for intelligent Video Surveillance Systems. While recent deep learning-based VAD models have shown promising results by generating high-resolution frames, they often lack competence in preserving detailed spatial and temporal coherence in video frames. To tackle this issue, we propose a self-supervised learning approach for VAD through an inter-patch relationship prediction task. Specifically, we introduce a two-branch vision transformer network designed to capture deep visual features of video frames, addressing spatial and temporal dimensions responsible for modeling appearance and motion patterns, respectively. The inter-patch relationship in each dimension is decoupled into inter-patch similarity and the order information of each patch. To mitigate memory consumption,…
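To make the two decoupled targets concrete, here is a minimal, illustrative sketch of how an inter-patch similarity target and a patch order target could be built for a single frame. It is not the authors' implementation: the abstract does not say how similarity is measured or how the order task is constructed, so cosine similarity, a random permutation, the toy 4x4 frame, and all function names below are assumptions.

```python
import math
import random


def split_into_patches(frame, patch_size):
    """Split a 2D grid (list of rows) into flattened, non-overlapping patches in row-major order."""
    height, width = len(frame), len(frame[0])
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = [frame[top + dy][left + dx]
                     for dy in range(patch_size)
                     for dx in range(patch_size)]
            patches.append(patch)
    return patches


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(patches):
    """Pairwise similarity between all patches (assumed similarity target)."""
    return [[cosine_similarity(p, q) for q in patches] for p in patches]


def shuffle_with_order_labels(patches, rng):
    """Shuffle patches; order[k] is the original index of the k-th shuffled patch (assumed order target)."""
    order = list(range(len(patches)))
    rng.shuffle(order)
    shuffled = [patches[i] for i in order]
    return shuffled, order


# Toy 4x4 "frame" with values 1..16, split into four 2x2 patches.
frame = [[float(r * 4 + c + 1) for c in range(4)] for r in range(4)]
patches = split_into_patches(frame, patch_size=2)
sim = similarity_matrix(patches)
shuffled, order = shuffle_with_order_labels(patches, random.Random(0))
```

This covers only the spatial case, with patches taken from one frame. Building the temporal counterpart from patches taken across consecutive frames would be a further assumption, since the abstract is truncated before those details.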


Taxonomy

Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Artificial Immune Systems Applications

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Softmax · Multi-Head Attention · Dense Connections · Layer Normalization · Vision Transformer