PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models
Jinhua Zhang, Hualian Sheng, Sijia Cai, Bing Deng, Qiao Liang, Wen Li, Ying Fu, Jieping Ye, Shuhang Gu

TL;DR
PerLDiff introduces a novel diffusion-based approach that leverages 3D geometric priors for precise, controllable street view image synthesis, outperforming existing methods in accuracy and robustness.
Contribution
The paper presents PerLDiff, a new diffusion model integrating perspective-layout 3D priors for improved controllability in street view image generation.
Findings
Achieves higher controllability than existing layout control methods.
Demonstrates superior performance on the nuScenes and KITTI datasets.
Enhances object-level control in street view synthesis.
Abstract
Controllable generation is considered a potentially vital approach to address the challenge of annotating 3D data, and the precision of such controllable generation becomes particularly imperative in the context of data production for autonomous driving. Existing methods focus on the integration of diverse generative information into controlling inputs, utilizing frameworks such as GLIGEN or ControlNet, to produce commendable outcomes in controllable generation. However, such approaches intrinsically restrict generation performance to the learning capacities of predefined network architectures. In this paper, we explore the innovative integration of controlling information and introduce PerLDiff (Perspective-Layout Diffusion Models), a novel method for effective street view image generation that fully leverages perspective 3D geometric information. Our…
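To make "perspective 3D geometric information" concrete, the following is a minimal sketch of how a 3D box annotation could be projected into a camera image and turned into a coarse 2D layout mask. It is an illustration under stated assumptions, not the paper's implementation: the helper names (`box_corners`, `project`, `perspective_mask`), the axis-aligned box, the pinhole camera model, and the rectangular mask are all hypothetical simplifications.

```python
from itertools import product


def box_corners(center, size):
    """Return the 8 corners of an axis-aligned 3D box in camera coordinates.

    Assumption: the box is axis-aligned; real annotations also carry a yaw angle.
    """
    cx, cy, cz = center
    w, h, l = size
    return [(cx + sx * w / 2, cy + sy * h / 2, cz + sz * l / 2)
            for sx, sy, sz in product((-1, 1), repeat=3)]


def project(point, fx, fy, u0, v0):
    """Pinhole projection of a camera-frame point (z forward) to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + u0, fy * y / z + v0)


def perspective_mask(center, size, fx, fy, u0, v0, width, height):
    """Binary mask (list of rows) covering the 2D extent of the projected box.

    Assumption: the mask is the bounding rectangle of the projected corners,
    clipped to the image; a finer mask would rasterize the projected box faces.
    """
    pts = [project(p, fx, fy, u0, v0) for p in box_corners(center, size)]
    us, vs = zip(*pts)
    u_min, u_max = max(0, int(min(us))), min(width - 1, int(max(us)))
    v_min, v_max = max(0, int(min(vs))), min(height - 1, int(max(vs)))
    return [[1 if u_min <= u <= u_max and v_min <= v <= v_max else 0
             for u in range(width)] for v in range(height)]


# Usage: a 2 m cube centered 10 m in front of a toy 16x16 camera.
mask = perspective_mask((0, 0, 10), (2, 2, 2), 10, 10, 8, 8, 16, 16)
```

A mask of this kind marks where an object should appear in the image plane; how such a prior is actually injected into the diffusion model is described in the paper itself and is not reproduced here.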
Taxonomy
Topics: Remote Sensing and LiDAR Applications · Automated Road and Building Extraction · Remote Sensing and Land Use
Methods: Focus · Diffusion
