VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models

Ying Cheng; Yu-Ho Lin; Min-Hung Chen; Fu-En Yang; Shang-Hong Lai

arXiv:2511.07299·cs.CV·December 15, 2025

Open Access

TL;DR

VADER introduces a relation-aware large language model framework for video anomaly understanding, integrating object relations and causal reasoning to produce detailed, explainable descriptions of anomalous events in videos.

Contribution

The paper presents VADER, a novel LLM-driven framework that incorporates object relations and causal context for enhanced video anomaly understanding and explanation.

Findings

01. VADER outperforms existing methods on multiple VAU benchmarks.

02. It provides detailed, causally grounded descriptions of anomalies.

03. VADER effectively supports anomaly explanation and question answering.

Abstract

Video anomaly understanding (VAU) aims to provide detailed interpretation and semantic comprehension of anomalous events within videos, addressing limitations of traditional methods that focus solely on detecting and localizing anomalies. However, existing approaches often neglect the deeper causal relationships and interactions between objects, which are critical for understanding anomalous behaviors. In this paper, we propose VADER, an LLM-driven framework for Video Anomaly unDErstanding, which integrates keyframe object Relation features with visual cues to enhance anomaly comprehension from video. Specifically, VADER first applies an Anomaly Scorer to assign per-frame anomaly scores, followed by a Context-AwarE Sampling (CAES) strategy to capture the causal context of each anomalous event. A Relation Feature Extractor and a COntrastive Relation Encoder (CORE) jointly model dynamic…
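
The first two stages named in the abstract (per-frame anomaly scoring, then sampling context around the anomalous event) can be illustrated with a minimal sketch. This is not the paper's CAES algorithm, whose details are not given here. It assumes the anomaly scores are already available as a list of floats, and it picks frames at a fixed stride before and after the highest-scoring frame. The function name `sample_context_frames` and the parameters `window` and `stride` are hypothetical.

```python
from typing import List, Sequence


def sample_context_frames(
    scores: Sequence[float], window: int = 2, stride: int = 1
) -> List[int]:
    """Return sorted frame indices around the highest-scoring frame.

    Hypothetical illustration only: `window` frames are taken on each side
    of the peak, spaced `stride` frames apart, so the sample covers what
    happened before and after the most anomalous moment.
    """
    if not scores:
        return []
    # Index of the frame with the highest anomaly score (first one on ties).
    peak = max(range(len(scores)), key=lambda i: scores[i])
    # Symmetric offsets around the peak, e.g. -2, 0, 2 for window=1, stride=2.
    offsets = range(-window * stride, window * stride + 1, stride)
    # Keep only indices that fall inside the video.
    picked = {peak + off for off in offsets if 0 <= peak + off < len(scores)}
    return sorted(picked)


# Assumed example scores; frame 3 is the most anomalous.
scores = [0.1, 0.1, 0.2, 0.9, 0.4, 0.1]
print(sample_context_frames(scores, window=1, stride=2))  # [1, 3, 5]
```

A fixed symmetric window is the simplest choice for illustration. A context-aware strategy could instead adapt the window to the shape of the score curve, but that would be a further assumption beyond what the abstract states.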

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Multimodal Machine Learning Applications