Beyond the Benchmark: Detecting Diverse Anomalies in Videos

Yoav Arad; Michael Werman

arXiv:2310.01904 · cs.CV · October 4, 2023

Open Access · 1 Repo · 3 Reviews

TL;DR

This paper introduces new datasets and a novel multi-frame anomaly detection method to identify diverse and complex anomalies in videos, surpassing traditional single-frame benchmark limitations.

Contribution

The paper presents HMDB-AD and HMDB-Violence datasets for diverse anomalies and proposes MFAD, a multi-frame detection method enhancing video anomaly detection capabilities.

Findings

01. MFAD outperforms existing models on new anomaly types

02. Datasets challenge models with complex, action-based anomalies

03. Experimental results validate the effectiveness of multi-frame features

Abstract

Video Anomaly Detection (VAD) plays a crucial role in modern surveillance systems, aiming to identify various anomalies in real-world situations. However, current benchmark datasets predominantly emphasize simple, single-frame anomalies such as novel object detection. This narrow focus restricts the advancement of VAD models. In this research, we advocate for an expansion of VAD investigations to encompass intricate anomalies that extend beyond conventional benchmark boundaries. To facilitate this, we introduce two datasets, HMDB-AD and HMDB-Violence, to challenge models with diverse action-based anomalies. These datasets are derived from the HMDB51 action recognition dataset. We further present Multi-Frame Anomaly Detection (MFAD), a novel method built upon the AI-VAD framework. AI-VAD utilizes single-frame features such as pose estimation and deep image encoding, and two-frame…

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

Strengths of the MFAD approach:
++ Comprehensive Feature Extraction: MFAD extracts four diverse feature types, including object velocities, human pose estimations, deep image encodings, and deep video encodings, enabling a holistic analysis of video data.
++ Adaptive Density Score Calculation: Using Gaussian Mixture Models (GMM) for velocity features and k-nearest neighbors (kNN) for other high-dimensional features, it adapts the density score calculation to the nature of the features, enhanci…
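To make the density-score idea in this review concrete, here is a minimal sketch, not the authors' implementation. It assumes a kNN score defined as the mean distance to the k nearest "normal" training vectors, and a 1-D Gaussian mixture scored by negative log-likelihood. The function names, the toy feature vectors, and the mixture parameters are all hypothetical, and the mixture is given fixed parameters rather than being fitted.

```python
import math

def knn_score(query, train, k=3):
    """Mean Euclidean distance from `query` to its k nearest training vectors.
    Larger values mean the query lies in a sparser region (more anomalous)."""
    dists = sorted(math.dist(query, t) for t in train)
    k = min(k, len(dists))
    return sum(dists[:k]) / k

def gmm_nll(x, weights, means, stds):
    """Negative log-likelihood of scalar `x` under a 1-D Gaussian mixture.
    The parameters are supplied by the caller (hypothetical, not fitted here)."""
    density = sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )
    return -math.log(max(density, 1e-300))  # guard against log(0)

# Hypothetical 2-D stand-ins for high-dimensional encodings of normal data.
normal_features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
typical_knn = knn_score((0.05, 0.0), normal_features)
unusual_knn = knn_score((2.0, 2.0), normal_features)

# Hypothetical two-component mixture over a scalar velocity magnitude.
mix = dict(weights=[0.5, 0.5], means=[1.0, 2.0], stds=[0.5, 0.5])
typical_gmm = gmm_nll(1.2, **mix)
unusual_gmm = gmm_nll(6.0, **mix)
```

In both cases a higher score indicates lower density under the normal data, so `unusual_knn` and `unusual_gmm` come out larger than their typical counterparts.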

Weaknesses

Weaknesses identified in the manuscript:
-- Complexity: MFAD's multi-stage process and diverse feature extraction can make it computationally demanding and challenging to implement in resource-constrained environments.
-- Model Specificity: Utilizing specific video foundation models may reduce adaptability to different datasets or domains.
-- Not Real-Time: The computational cost and the requirement for separate training/testing data make real-time application challenging.

Reviewer 02 · Rating 3 (reject, not good enough) · Confidence 5

Strengths

(1) The paper addresses the limitation of current benchmark datasets for video anomaly detection and proposes two new datasets that allow for the detection of complex action-based anomalies. This expands the scope of what constitutes an anomaly and encourages further research on more comprehensive anomaly types.
(2) The proposed method, MFAD, simply incorporates deep video encoding features and logistic regression to effectively detect both simple and complex anomalies. The experimental results…
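The truncated review does not say how logistic regression enters the method. One plausible reading, shown here purely as an assumption, is a learned fusion of per-feature anomaly scores into one probability. The sketch below trains a small logistic regression by batch gradient descent; the score rows, the labels, and the function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict(x, weights, bias):
    """Fused anomaly probability for one row of per-feature scores."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical per-feature scores (e.g. velocity, pose, video encoding),
# with label 1 marking an anomalous sample.
rows = [(0.1, 0.2, 0.1), (0.2, 0.1, 0.3), (0.3, 0.2, 0.2),
        (0.8, 0.9, 0.7), (0.7, 0.6, 0.9), (0.9, 0.8, 0.8)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(rows, labels)
p_normal = predict((0.15, 0.1, 0.2), weights, bias)
p_anomalous = predict((0.85, 0.8, 0.9), weights, bias)
```

A learned fusion of this kind needs labeled examples, which is an assumption of the sketch rather than something the review states.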

Weaknesses

(1) The paper lacks a more detailed description of the datasets HMDB-AD and HMDB-Violence. It would be beneficial to provide more information on the distribution of normal and abnormal activities, and any specific challenges or characteristics of the datasets.
(2) The method proposed in this paper is more like a simple patchwork combination that lacks sound and rigorous theoretical support. Moreover, the paper lacks a more detailed and visual explanation of the proposed method.
(3) The article l…

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 5

Strengths

- Good overview of existing methods and datasets used in anomaly detection.
- "Introducing" new videos to video anomaly detection for benchmarking.
- Comprehensive and competitive results on the public benchmarks, and significantly higher results than the state of the art on the two subsets of HMDB51.
- Proper ablation study, which also shows the effect of video encoding features in Table 4.

Weaknesses

Despite the interesting results, the paper's method reads like a simple extension of [1] that adds temporal features. Although the authors cherry-picked videos from HMDB51 and suggest using them for anomaly detection, they claim this as their own data (Table 1), which is not correct. The ablation study shows that video encoder features alone produce almost the same results as the entire set of features on the subsets of HMDB51, so what is the point of including the other fe…

Code & Models

Repositories

yoavarad/mfad
PyTorch · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Network Security and Intrusion Detection

Methods: Focus · Logistic Regression