From Bird's-Eye to Street View: Crafting Diverse and Condition-Aligned Images with Latent Diffusion Model
Xiaojie Xu, Tianshuo Xu, Fulong Ma, Yingcong Chen

TL;DR
This paper presents a framework that converts Bird's-Eye View (BEV) maps into multi-view street images using a neural view transformation followed by a fine-tuned latent diffusion model, with the aim of improving traffic scene generation for autonomous driving.
Contribution
It introduces a novel two-step approach combining neural view transformation with fine-tuned diffusion models for accurate, diverse street view image synthesis from BEV maps.
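To make the first step concrete, here is a minimal sketch of what moving from the BEV frame to a camera view involves. It is a purely geometric stand-in, not the paper's neural view transformation: it projects flat-ground BEV points into one camera with a pinhole model. The intrinsics `K`, the rotation `R_FRONT`, the camera height in `CAM_CENTER`, and the axis conventions are all illustrative assumptions, not values from the paper.

```python
import numpy as np

# Assumed conventions (hypothetical, for illustration only):
#   ego frame:    x forward, y left, z up, ground plane at z = 0
#   camera frame: x right,   y down, z forward (pinhole model)
K = np.array([[1000.0, 0.0, 800.0],
              [0.0, 1000.0, 450.0],
              [0.0, 0.0, 1.0]])          # assumed intrinsics
R_FRONT = np.array([[0.0, -1.0, 0.0],
                    [0.0, 0.0, -1.0],
                    [1.0, 0.0, 0.0]])    # ego -> front-facing camera axes
CAM_CENTER = np.array([0.0, 0.0, 1.5])   # assumed camera 1.5 m above ground


def project_ground_points(points_xy, K, R, cam_center, min_depth=0.1):
    """Project BEV ground-plane points (N, 2) into one camera image.

    Returns the pixel coordinates of the points in front of the camera
    and a boolean mask marking which input points those are.
    """
    points_xy = np.asarray(points_xy, dtype=float)
    # Lift BEV points onto the flat ground plane (z = 0) in the ego frame.
    pts_ego = np.column_stack([points_xy, np.zeros(len(points_xy))])
    # Express the points in the camera frame.
    pts_cam = (R @ (pts_ego - cam_center).T).T
    # Discard points behind (or too close to) the camera.
    in_front = pts_cam[:, 2] > min_depth
    # Apply the intrinsics, then divide by depth to get pixels.
    uvw = (K @ pts_cam[in_front].T).T
    pixels = uvw[:, :2] / uvw[:, 2:3]
    return pixels, in_front


if __name__ == "__main__":
    bev_points = [[10.0, 0.0], [10.0, 2.0], [-5.0, 0.0]]
    pixels, mask = project_ground_points(bev_points, K, R_FRONT, CAM_CENTER)
    print(pixels)
    print(mask)
```

In a multi-view setup, the same function would be called once per camera with that camera's own rotation and position. The resulting per-view layouts are the kind of condition a fine-tuned diffusion model could then be asked to follow in the second step.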
Findings
Effective generation of multi-view street images from BEV maps.
High-quality, diverse, and condition-coherent images produced.
Framework leverages pretrained diffusion models for traffic scene synthesis.
Abstract
We explore Bird's-Eye View (BEV) generation, converting a BEV map into its corresponding multi-view street images. Valued for its unified spatial representation aiding multi-sensor fusion, BEV is pivotal for various autonomous driving applications. Creating accurate street-view images from BEV maps is essential for portraying complex traffic scenarios and enhancing driving algorithms. Concurrently, diffusion-based conditional image generation models have demonstrated remarkable outcomes, adept at producing diverse, high-quality, and condition-aligned results. Nonetheless, the training of these models demands substantial data and computational resources. Hence, exploring methods to fine-tune these advanced models, like Stable Diffusion, for specific conditional generation tasks emerges as a promising avenue. In this paper, we introduce a practical framework for generating images from a…
Taxonomy
Topics: Image Processing and 3D Reconstruction · 3D Surveying and Cultural Heritage
Methods: Diffusion
