Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis
Lei-lei Li, Jianwu Fang, Junbin Xiao, Shanmin Pang, Hongkai Yu, Chen Lv, Jianru Xue, and Tat-Seng Chua

TL;DR
This paper introduces Causal-VidSyn, a diffusion model that synthesizes egocentric traffic accident videos with causal understanding, aiding safety testing for autonomous vehicles by accurately modeling accident participants and their behaviors.
Contribution
It presents a novel diffusion-based approach with causal entity grounding using cause descriptions and driver fixations, along with a large driver gaze dataset for accident scenarios.
Findings
Causal-VidSyn outperforms existing models in video quality and causal sensitivity.
It effectively synthesizes accident videos from various inputs.
The approach enhances safety testing for autonomous driving systems.
Abstract
Egocentrically comprehending the causes and effects of car accidents is crucial for the safety of self-driving cars, and synthesizing causal-entity reflected accident videos can facilitate the capability test to respond to unaffordable accidents in reality. However, incorporating causal relations as seen in real-world videos into synthetic videos remains challenging. This work argues that precisely identifying the accident participants and capturing their related behaviors are of critical importance. In this regard, we propose a novel diffusion model, Causal-VidSyn, for synthesizing egocentric traffic accident videos. To enable causal entity grounding in video diffusion, Causal-VidSyn leverages the cause descriptions and driver fixations to identify the accident participants and behaviors, facilitated by accident reason answering and gaze-conditioned selection modules. To support…
