Mirage: One-Step Video Diffusion for Photorealistic and Coherent Asset Editing in Driving Scenes
Shuyun Wang, Haiyang Sun, Bing Wang, Hangjun Ye, Xin Yu

TL;DR
Mirage is a novel one-step video diffusion model that enables photorealistic, temporally coherent asset editing in driving scenes, addressing fidelity and consistency challenges in autonomous driving data augmentation.
Contribution
We introduce Mirage, a one-step video diffusion approach with a new latent injection and data alignment strategy for improved asset editing in driving videos.
Findings
Achieves high realism and temporal coherence in editing scenarios
Outperforms existing methods in visual fidelity and consistency
Generalizes well to other video translation tasks
Abstract
Vision-centric autonomous driving systems rely on diverse and scalable training data to achieve robust performance. While video object editing offers a promising path for data augmentation, existing methods often struggle to maintain both high visual fidelity and temporal coherence. In this work, we propose Mirage, a one-step video diffusion model for photorealistic and coherent asset editing in driving scenes. Mirage builds upon a text-to-video diffusion prior to ensure temporal consistency across frames. However, 3D causal variational autoencoders often suffer from degraded spatial fidelity due to compression, and directly passing 3D encoder features to decoder layers breaks temporal causality. To address this, we inject temporally agnostic latents from a pretrained 2D encoder into the 3D decoder to restore detail while preserving causal structures. Furthermore, because scene…
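The latent-injection idea in the abstract can be illustrated with a toy sketch. This is not Mirage's implementation: `encode_2d`, `causal_features`, and `inject` are hypothetical stand-ins, frames are reduced to short lists of floats, and the "encoder" and "decoder features" are simple arithmetic chosen only to show one property. Because the injected latents are computed from a single frame at a time, adding them to causal features does not let any frame depend on later frames.

```python
from typing import List

Frame = List[float]  # one frame, flattened to a short vector (toy assumption)


def encode_2d(frame: Frame) -> Frame:
    """Hypothetical stand-in for a pretrained 2D encoder.

    It sees exactly one frame, so its output is temporally agnostic.
    """
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]


def causal_features(frames: List[Frame]) -> List[Frame]:
    """Hypothetical stand-in for causal 3D decoder features.

    The feature at time t is the running mean of frames 0..t, so it
    never depends on frames later than t.
    """
    running = [0.0] * len(frames[0])
    out: List[Frame] = []
    for t, frame in enumerate(frames):
        running = [r + x for r, x in zip(running, frame)]
        out.append([r / (t + 1) for r in running])
    return out


def inject(frames: List[Frame]) -> List[Frame]:
    """Add per-frame latents to the causal features (toy injection)."""
    feats = causal_features(frames)
    return [
        [f + z for f, z in zip(feat, encode_2d(frame))]
        for feat, frame in zip(feats, frames)
    ]


# Usage: editing only the last frame leaves earlier outputs unchanged.
video = [[1.0, 2.0], [3.0, 5.0], [0.0, 4.0]]
edited = video[:2] + [[10.0, 4.0]]
assert inject(video)[:2] == inject(edited)[:2]
```

By contrast, a skip connection whose features at frame t had been computed with access to later frames would fail this check, which is the causality concern the abstract raises about passing 3D encoder features directly to decoder layers.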
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques
