Veila: Panoramic LiDAR Generation from a Monocular RGB Image

Youquan Liu; Lingdong Kong; Weidong Yang; Ao Liang; Jianxiong Gao; Yang Wu; Xiang Xu; Xin Li; Linfeng Li; Runnan Chen; Ben Fei

arXiv:2508.03690·cs.CV·August 6, 2025

TL;DR

Veila is a novel diffusion-based framework that generates realistic panoramic LiDAR data from monocular RGB images, addressing challenges in spatial control, alignment, and structural coherence for autonomous driving applications.

Contribution

The paper introduces a conditional diffusion model for RGB-to-LiDAR synthesis that combines adaptive conditioning, robust cross-modal alignment, and global structural coherence.

Findings

01 Achieves state-of-the-art fidelity and consistency in LiDAR generation

02 Improves downstream LiDAR semantic segmentation performance

03 Demonstrates effectiveness on multiple benchmark datasets

Abstract

Realistic and controllable panoramic LiDAR data generation is critical for scalable 3D perception in autonomous driving and robotics. Existing methods either perform unconditional generation with poor controllability or adopt text-guided synthesis, which lacks fine-grained spatial control. Leveraging a monocular RGB image as a spatial control signal offers a scalable and low-cost alternative, but it remains an open problem and faces three core challenges: (i) semantic and depth cues from RGB vary spatially, complicating reliable conditional generation; (ii) modality gaps between RGB appearance and LiDAR geometry amplify alignment errors under noisy diffusion; and (iii) maintaining structural coherence between monocular RGB and panoramic LiDAR is challenging, particularly in regions where the image and the LiDAR scan do not overlap. To address these challenges, we propose Veila, a novel…
