GA-Drive: Geometry-Appearance Decoupled Modeling for Free-viewpoint Driving Scene Generation

Hao Zhang; Lue Fan; Qitai Wang; Wenbo Li; Zehuan Wu; Lewei Lu; Zhaoxiang Zhang; Hongsheng Li

arXiv:2602.20673 · cs.CV · March 16, 2026

Open Access

TL;DR

GA-Drive is a free-viewpoint driving scene generator that decouples geometry from appearance: it renders geometry-based pseudo-views along user-defined trajectories, then uses a video diffusion model to turn them into photorealistic views, which keeps the results editable and consistent.

Contribution

The paper introduces a geometry-appearance decoupling framework combined with diffusion-based synthesis for flexible, photorealistic driving scene generation.

Findings

01. Outperforms existing methods in NTA-IoU, NTL-IoU, and FID scores.

02. Supports appearance editing while preserving geometry.

03. Enables generation of novel views along specified trajectories.

Abstract

A free-viewpoint, editable, and high-fidelity driving simulator is crucial for training and evaluating end-to-end autonomous driving systems. In this paper, we present GA-Drive, a novel simulation framework capable of generating camera views along user-specified novel trajectories through Geometry-Appearance Decoupling and Diffusion-Based Generation. Given a set of images captured along a recorded trajectory and the corresponding scene geometry, GA-Drive synthesizes novel pseudo-views using geometry information. These pseudo-views are then transformed into photorealistic views using a trained video diffusion model. In this way, we decouple the geometry and appearance of scenes. An advantage of such decoupling is its support for appearance editing via state-of-the-art video-to-video editing techniques, while preserving the underlying geometry, enabling consistent edits across both…
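The geometry-only step described in the abstract (synthesizing a pseudo-view for a novel camera pose) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how GA-Drive represents scene geometry or renders pseudo-views, so the sketch assumes a colored point cloud, a pinhole camera, and a hypothetical helper named `render_pseudo_view`. Pixels that no point reaches stay empty; in a pipeline like the one described, a refinement model would be responsible for producing the final photorealistic view.

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Color = Tuple[int, int, int]


def render_pseudo_view(
    points: Sequence[Vec3],
    colors: Sequence[Color],
    rotation: Sequence[Sequence[float]],  # 3x3 world-to-camera rotation
    translation: Vec3,                    # world-to-camera translation
    fx: float, fy: float, cx: float, cy: float,  # pinhole intrinsics
    width: int, height: int,
) -> List[List[Optional[Color]]]:
    """Hypothetical helper (not from the paper): project a colored point
    cloud into a novel pinhole camera, keeping the nearest point per pixel.
    Pixels that receive no point are left as None (holes)."""
    image: List[List[Optional[Color]]] = [[None] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]

    for (x, y, z), color in zip(points, colors):
        # Transform the point from world coordinates to camera coordinates.
        xc = rotation[0][0] * x + rotation[0][1] * y + rotation[0][2] * z + translation[0]
        yc = rotation[1][0] * x + rotation[1][1] * y + rotation[1][2] * z + translation[1]
        zc = rotation[2][0] * x + rotation[2][1] * y + rotation[2][2] * z + translation[2]
        if zc <= 0.0:
            continue  # point is behind the camera

        # Pinhole projection to integer pixel coordinates.
        u = int(round(fx * xc / zc + cx))
        v = int(round(fy * yc / zc + cy))

        # Z-buffer test: keep only the closest point for each pixel.
        if 0 <= u < width and 0 <= v < height and zc < depth[v][u]:
            depth[v][u] = zc
            image[v][u] = color

    return image
```

Moving the camera along a new trajectory then amounts to calling the function with a different `rotation` and `translation` per frame. The appearance of the result is deliberately crude, which is the point of the decoupling: geometry fixes where things are, and a separate model decides how they look.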

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Face recognition and analysis