FineVAU: A Novel Human-Aligned Benchmark for Fine-Grained Video Anomaly Understanding

João Pereira; Vasco Lopes; João Neves; David Semedo

arXiv:2601.17258 · cs.CV · February 24, 2026

TL;DR

This paper introduces FineVAU, a new benchmark for detailed video anomaly understanding, featuring a novel evaluation metric and dataset to better assess human-aligned, fine-grained analysis of unusual video events.

Contribution

The paper proposes a comprehensive benchmark with a new human-aligned evaluation metric and a high-quality dataset to improve fine-grained video anomaly understanding.

Findings

1. FVScore aligns better with human perception than existing metrics.

2. LVLMs struggle with spatial and temporal details in anomalies.

3. FineVAU reveals limitations of current models in detailed anomaly comprehension.

Abstract

Video Anomaly Understanding (VAU) is a novel task focused on describing unusual occurrences in videos. Despite growing interest, the evaluation of VAU remains an open challenge. Existing benchmarks rely on n-gram-based metrics (e.g., BLEU, ROUGE-L) or LLM-based evaluation. The former fail to capture the rich, free-form, and visually grounded nature of LVLM responses, while the latter focuses on assessing language quality over factual relevance, often resulting in subjective judgments that are misaligned with human perception. In this work, we address this issue by proposing FineVAU, a new benchmark for VAU that shifts the focus towards rich, fine-grained, and domain-specific understanding of anomalous videos. We formulate VAU as a three-fold problem, with the goal of comprehensively understanding key descriptive elements of anomalies in video: events (What), participating entities (Who)…
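The abstract's concern about n-gram-based metrics can be illustrated with a small sketch. The code below is a simplified ROUGE-L (longest-common-subsequence F1 with lowercase whitespace tokenization), not the scoring code used by FineVAU or by any existing benchmark. The three sentences are hypothetical examples invented for illustration and do not come from the dataset.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L: F1 over the LCS of whitespace tokens (assumption)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical anomaly descriptions (not taken from the benchmark).
reference = "a man breaks the car window and steals a bag"
paraphrase = "someone smashes a vehicle's glass and takes a purse"
wrong_fact = "a man cleans the car window and carries a bag"

score_paraphrase = rouge_l_f1(reference, paraphrase)
score_wrong_fact = rouge_l_f1(reference, wrong_fact)

print(f"faithful paraphrase: {score_paraphrase:.2f}")
print(f"wrong event, similar wording: {score_wrong_fact:.2f}")
```

In this toy case, the description that gets the event wrong but reuses the reference wording scores higher than the faithful paraphrase, which is the kind of mismatch with factual relevance that the abstract attributes to n-gram metrics.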

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Video Analysis and Summarization