Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Lei-lei Li; Jianwu Fang; Junbin Xiao; Shanmin Pang; Hongkai Yu; Chen Lv; Jianru Xue; and Tat-Seng Chua

arXiv:2506.23263·cs.CV·July 1, 2025

TL;DR

This paper introduces Causal-VidSyn, a diffusion model that synthesizes egocentric traffic accident videos with causal understanding, aiding safety testing for autonomous vehicles by accurately modeling accident participants and their behaviors.

Contribution

It presents a novel diffusion-based approach with causal entity grounding using cause descriptions and driver fixations, along with a large driver gaze dataset for accident scenarios.

Findings

01 Causal-VidSyn outperforms existing models in video quality and causal sensitivity.

02 It effectively synthesizes accident videos from various inputs.

03 The approach enhances safety testing for autonomous driving systems.

Abstract

Comprehending the causes and effects of car accidents from an egocentric view is crucial for the safety of self-driving cars, and synthesizing causal-entity reflected accident videos can facilitate testing a vehicle's capability to respond to accidents that are unaffordable to stage in reality. However, incorporating causal relations as seen in real-world videos into synthetic videos remains challenging. This work argues that precisely identifying the accident participants and capturing their related behaviors is of critical importance. In this regard, we propose a novel diffusion model, Causal-VidSyn, for synthesizing egocentric traffic accident videos. To enable causal entity grounding in video diffusion, Causal-VidSyn leverages cause descriptions and driver fixations to identify the accident participants and behaviors, facilitated by accident reason answering and gaze-conditioned selection modules. To support…
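The abstract gives no implementation details for the gaze-conditioned selection module, so the following is only a minimal sketch of the general idea of using driver fixations to pick out image regions, not the paper's method. It assumes a fixation heatmap split into non-overlapping square patches and a simple top-k rule; the function names, patch size, and toy values are all hypothetical.

```python
from typing import List, Sequence


def patch_gaze_scores(gaze_map: Sequence[Sequence[float]], patch: int) -> List[float]:
    """Sum gaze-map values inside each non-overlapping patch x patch cell (row-major)."""
    height, width = len(gaze_map), len(gaze_map[0])
    if height % patch or width % patch:
        raise ValueError("gaze map size must be divisible by the patch size")
    scores = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            total = 0.0
            for row in range(top, top + patch):
                for col in range(left, left + patch):
                    total += gaze_map[row][col]
            scores.append(total)
    return scores


def select_gazed_patches(gaze_map: Sequence[Sequence[float]], patch: int, k: int) -> List[int]:
    """Return the indices of the k patches with the most gaze mass, in row-major order."""
    scores = patch_gaze_scores(gaze_map, patch)
    k = min(k, len(scores))
    # Rank by descending score; ties are broken by the lower patch index.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


# Hypothetical 4x4 fixation map with attention concentrated in the lower-right corner.
toy_gaze = [
    [0.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.1],
    [0.0, 0.1, 0.6, 0.9],
    [0.0, 0.0, 0.8, 1.0],
]
selected = select_gazed_patches(toy_gaze, patch=2, k=2)
```

In a video model, the selected indices could be used to keep or reweight the corresponding frame tokens; how Causal-VidSyn actually conditions on gaze is not stated in the text above.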
