From Frames to Events: Rethinking Evaluation in Human-Centric Video Anomaly Detection

Narges Rashvand; Shanle Yao; Armin Danesh Pazho; Babak Rahimi Ardabili; Hamed Tabkhi

arXiv:2604.09327 · cs.CV · April 13, 2026

TL;DR

This paper argues for event-centric evaluation in pose-based Video Anomaly Detection (VAD): it shows where frame-level metrics fall short and proposes an event-based evaluation standard together with two strategies for temporal event localization.

Contribution

The paper introduces an event-based evaluation standard for VAD, audits widely used benchmarks, and proposes two strategies for temporal event localization.

Findings

01. State-of-the-art models perform poorly at event-level localization, with precision below 10%.

02. Existing benchmarks are misaligned with real-world event detection needs.

03. The proposed methods improve event detection metrics but still reveal significant performance gaps.

Abstract

Pose-based Video Anomaly Detection (VAD) has gained significant attention for its privacy-preserving nature and robustness to environmental variations. However, traditional frame-level evaluations treat video as a collection of isolated frames, fundamentally misaligned with how anomalies manifest and are acted upon in the real world. In operational surveillance systems, what matters is not the flagging of individual frames, but the reliable detection, localization, and reporting of a coherent anomalous event, a contiguous temporal episode with an identifiable onset and duration. Frame-level metrics are blind to this distinction, and as a result, they systematically overestimate model performance for any deployment that requires actionable, event-level alerts. In this work, we propose a shift toward an event-centric perspective in VAD. We first audit widely used VAD benchmarks, including…
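To make the gap between frame-level and event-level scoring concrete, here is a minimal Python sketch. It is not the paper's evaluation protocol: the helper names (`frames_to_events`, `temporal_iou`, `event_precision_recall`, `frame_accuracy`), the greedy one-to-one matching, and the temporal IoU threshold of 0.5 are all assumptions chosen for illustration. It groups per-frame flags into contiguous events and compares a fragmented prediction against a single ground-truth event.

```python
from typing import List, Tuple

Event = Tuple[int, int]  # inclusive (start_frame, end_frame)


def frames_to_events(flags: List[int]) -> List[Event]:
    """Group consecutive anomalous frames (flag == 1) into contiguous events."""
    events: List[Event] = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i                      # event onset
        elif not flag and start is not None:
            events.append((start, i - 1))  # event ended on the previous frame
            start = None
    if start is not None:                  # event runs to the last frame
        events.append((start, len(flags) - 1))
    return events


def temporal_iou(a: Event, b: Event) -> float:
    """Intersection over union of two inclusive frame intervals."""
    inter = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if inter <= 0:
        return 0.0
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union


def event_precision_recall(pred: List[Event], gt: List[Event],
                           iou_thr: float = 0.5) -> Tuple[float, float]:
    """Greedy one-to-one matching; iou_thr is an illustrative assumption."""
    matched = set()
    tp = 0
    for p in pred:
        for j, g in enumerate(gt):
            if j not in matched and temporal_iou(p, g) >= iou_thr:
                matched.add(j)
                tp += 1
                break
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gt) if gt else 0.0
    return precision, recall


def frame_accuracy(pred_flags: List[int], gt_flags: List[int]) -> float:
    """Fraction of frames whose predicted label matches the ground truth."""
    return sum(p == g for p, g in zip(pred_flags, gt_flags)) / len(gt_flags)


# Toy example: one 10-frame ground-truth event, predicted as flickering fragments.
gt_flags = [0] * 5 + [1] * 10 + [0] * 5
pred_flags = [0] * 5 + [1, 1, 0, 1, 1, 0, 1, 1, 0, 1] + [0] * 5

gt_events = frames_to_events(gt_flags)
pred_events = frames_to_events(pred_flags)
acc = frame_accuracy(pred_flags, gt_flags)
precision, recall = event_precision_recall(pred_events, gt_events)
```

In this toy case 17 of 20 frames are labeled correctly, yet the prediction splits one event into four short fragments, none of which overlaps the ground-truth event enough to count as a match under the assumed threshold. A system raising one alert per fragment would look strong on a frame-level score while delivering no usable event-level detection.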

Code & Models

Repository: TeCSAR-UNCC/EventCentric-VAD (GitHub)