GA-Drive: Geometry-Appearance Decoupled Modeling for Free-viewpoint Driving Scene Generation
Hao Zhang, Lue Fan, Qitai Wang, Wenbo Li, Zehuan Wu, Lewei Lu, Zhaoxiang Zhang, Hongsheng Li

TL;DR
GA-Drive is a novel free-viewpoint driving scene generator that decouples geometry and appearance, enabling high-fidelity, editable, and consistent scene synthesis along user-defined trajectories using diffusion models.
Contribution
GA-Drive introduces a geometry-appearance decoupling framework that pairs geometry-based pseudo-view rendering with diffusion-based synthesis for flexible, photorealistic driving scene generation.
Findings
Outperforms existing methods on the NTA-IoU, NTL-IoU, and FID metrics.
Supports appearance editing while preserving geometry.
Enables generation of novel views along specified trajectories.
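For readers unfamiliar with the FID score listed in the findings, its standard definition (general, not specific to GA-Drive) compares Gaussian fits to Inception-network features of real and generated images, and lower values are better. The symbols are the conventional ones rather than notation from this paper: $\mu_r, \Sigma_r$ are the feature mean and covariance for real images, and $\mu_g, \Sigma_g$ are those for generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```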
Abstract
A free-viewpoint, editable, and high-fidelity driving simulator is crucial for training and evaluating end-to-end autonomous driving systems. In this paper, we present GA-Drive, a novel simulation framework capable of generating camera views along user-specified novel trajectories through Geometry-Appearance Decoupling and Diffusion-Based Generation. Given a set of images captured along a recorded trajectory and the corresponding scene geometry, GA-Drive synthesizes novel pseudo-views using geometry information. These pseudo-views are then transformed into photorealistic views using a trained video diffusion model. In this way, we decouple the geometry and appearance of scenes. An advantage of such decoupling is its support for appearance editing via state-of-the-art video-to-video editing techniques, while preserving the underlying geometry, enabling consistent edits across both…
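The abstract does not say how pseudo-views are produced from the scene geometry. As a minimal sketch, assuming the geometry is available as colored 3D points and the novel camera is a simple pinhole model, one way to form a pseudo-view is to project the points into the new camera with a z-buffer. The function name `render_pseudo_view` and all of its parameters are hypothetical and are not taken from the paper; the diffusion step that turns the pseudo-view into a photorealistic frame is not shown.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Color = Tuple[int, int, int]


def render_pseudo_view(
    points: Sequence[Point],
    colors: Sequence[Color],
    rotation: Sequence[Sequence[float]],  # 3x3 world-to-camera rotation (assumed)
    translation: Sequence[float],         # world-to-camera translation (assumed)
    focal: float,                         # focal length in pixels (assumed)
    width: int,
    height: int,
) -> List[List[Optional[Color]]]:
    """Project colored points into a pinhole camera; None marks an empty pixel."""
    cx, cy = width / 2.0, height / 2.0
    image: List[List[Optional[Color]]] = [[None] * width for _ in range(height)]
    depth = [[math.inf] * width for _ in range(height)]

    for (x, y, z), color in zip(points, colors):
        # World -> camera coordinates: p_cam = R * p_world + t
        xc = rotation[0][0] * x + rotation[0][1] * y + rotation[0][2] * z + translation[0]
        yc = rotation[1][0] * x + rotation[1][1] * y + rotation[1][2] * z + translation[1]
        zc = rotation[2][0] * x + rotation[2][1] * y + rotation[2][2] * z + translation[2]
        if zc <= 0:
            continue  # point is behind the camera

        # Perspective projection to pixel coordinates
        u = math.floor(focal * xc / zc + cx)
        v = math.floor(focal * yc / zc + cy)
        if not (0 <= u < width and 0 <= v < height):
            continue  # point falls outside the image

        # Z-buffer: keep only the nearest point for each pixel
        if zc < depth[v][u]:
            depth[v][u] = zc
            image[v][u] = color

    return image
```

Pixels left as `None` are holes where no geometry projects into the new view. In a pipeline like the one the abstract describes, filling such regions plausibly would be the job of the generative model, while the projected pixels carry the geometry that any appearance edit is expected to preserve.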
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Face recognition and analysis
