VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning

Liyun Zhu; Qixiang Chen; Xi Shen; Xiaodong Cun

arXiv:2505.23504·cs.CV·May 30, 2025

TL;DR

VAU-R1 introduces a reinforcement fine-tuning framework for multimodal large language models to improve video anomaly reasoning, complemented by a new benchmark for evaluating interpretability and reasoning in anomaly detection.

Contribution

The paper presents VAU-R1, a data-efficient reinforcement fine-tuning approach for anomaly reasoning with multimodal large language models, and VAU-Bench, which the authors describe as the first Chain-of-Thought benchmark tailored for video anomaly reasoning.

Findings

01  VAU-R1 improves question answering accuracy in anomaly scenarios.

02  VAU-R1 enhances temporal grounding and reasoning coherence.

03  The benchmark enables systematic evaluation of interpretability in video anomaly understanding.
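
The findings mention temporal grounding, that is, localizing when an anomaly occurs in a video. A common way to score a predicted time span against an annotated one is temporal intersection-over-union (IoU). The sketch below is illustrative only: this page does not state which metric the paper uses, and the function name `temporal_iou` and the `(start, end)` tuple format in seconds are assumptions made for this example.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) intervals, in seconds.

    Hypothetical helper for illustration; not taken from the VAU-R1 code.
    """
    p_start, p_end = pred
    g_start, g_end = gt
    # Overlap length is zero when the intervals do not intersect.
    inter = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (p_end - p_start) + (g_end - g_start) - inter
    return inter / union if union > 0 else 0.0


# Predicted anomaly at 2-6 s, annotated anomaly at 4-8 s:
# overlap is 2 s, union is 6 s.
score = temporal_iou((2.0, 6.0), (4.0, 8.0))
```

A score of 1.0 means the predicted span matches the annotation exactly, and 0.0 means the two spans do not overlap.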

Abstract

Video Anomaly Understanding (VAU) is essential for applications such as smart cities, security surveillance, and disaster alert systems, yet remains challenging due to its demand for fine-grained spatio-temporal perception and robust reasoning under ambiguity. Despite advances in anomaly detection, existing methods often lack interpretability and struggle to capture the causal and contextual aspects of abnormal events. This limitation is further compounded by the absence of comprehensive benchmarks for evaluating reasoning ability in anomaly scenarios. To address both challenges, we introduce VAU-R1, a data-efficient framework built upon Multimodal Large Language Models (MLLMs), which enhances anomaly reasoning through Reinforcement Fine-Tuning (RFT). Besides, we propose VAU-Bench, the first Chain-of-Thought benchmark tailored for video anomaly reasoning, featuring multiple-choice QA,…

Code & Models

Repositories

gvclab/vau-r1
PyTorch · Official

Datasets

7xiang/VAU-Bench
dataset · 30 dl
