HorizonForge: Driving Scene Editing with Any Trajectories and Any Vehicles
Yifan Wang, Francesco Pittaluga, Zaid Tasneem, Chenyu You, Manmohan Chandraker, Ziyu Jiang

TL;DR
HorizonForge is a framework for photorealistic, controllable driving scene editing. It reconstructs scenes as editable Gaussian Splats and Meshes for high fidelity, then renders edits through a video diffusion process that keeps scene variations spatially and temporally consistent, with the goal of advancing autonomous driving simulation.
Contribution
The paper introduces a unified scene reconstruction method based on Gaussian-Mesh representations, a diffusion-based process for rendering edits, and a new evaluation benchmark, HorizonSuite.
Findings
The Gaussian-Mesh representation outperforms other 3D representations in fidelity.
Temporal priors from video diffusion ensure scene coherence.
HorizonForge achieves significant gains in both user preference and FID.
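For readers unfamiliar with the FID metric named in the findings, the sketch below shows the Fréchet distance that FID is built on: the distance between two Gaussians fitted to feature vectors of real and generated images. It is a minimal illustration, not the paper's evaluation code. Standard FID takes its features from a pretrained Inception network; here plain NumPy arrays stand in for those features, and the function name `frechet_distance` is hypothetical.

```python
import numpy as np

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an array of shape (num_samples, feature_dim). In standard
    FID these would be Inception features; any features work for this sketch.
    """
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative when
    # both covariances are positive semi-definite. Clipping guards against
    # tiny negative values caused by floating-point error.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)

# Stand-in features (an assumption for illustration only).
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
shifted = real + 1.0  # same covariance, mean moved by 1 in each of 4 dims
print(frechet_distance(real, real))
print(frechet_distance(real, shifted))
```

Lower values mean the two feature distributions are closer, which is why an FID improvement corresponds to a decrease in the score.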
Abstract
Controllable driving scene generation is critical for realistic and scalable autonomous driving simulation, yet existing approaches struggle to jointly achieve photorealism and precise control. We introduce HorizonForge, a unified framework that reconstructs scenes as editable Gaussian Splats and Meshes, enabling fine-grained 3D manipulation and language-driven vehicle insertion. Edits are rendered through a noise-aware video diffusion process that enforces spatial and temporal consistency, producing diverse scene variations in a single feed-forward pass without per-trajectory optimization. To standardize evaluation, we further propose HorizonSuite, a comprehensive benchmark spanning ego- and agent-level editing tasks such as trajectory modifications and object manipulation. Extensive experiments show that Gaussian-Mesh representation delivers substantially higher fidelity than…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging
