R3DPA: Leveraging 3D Representation Alignment and RGB Pretrained Priors for LiDAR Scene Generation

Nicolas Sereyjol-Garros; Ellington Kirby; Victor Besnier; Nermin Samet

arXiv:2601.07692 · cs.CV · February 16, 2026

TL;DR

R3DPA is a LiDAR scene generation method that uses image-pretrained generative priors and self-supervised 3D features to improve realism and enable inference-time control, achieving state-of-the-art results on KITTI-360.

Contribution

R3DPA is the first LiDAR scene generation method to combine image-pretrained priors with self-supervised 3D features, improving generation quality and enabling control over the generated point clouds.

Findings

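1. Achieves state-of-the-art performance on KITTI-360.

2. Enables point cloud control at inference, such as object inpainting and scene mixing, using only an unconditional model.

3. Substantially improves generation quality by aligning the generative model's intermediate features with self-supervised 3D features.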

Abstract

LiDAR scene synthesis is an emerging solution to scarcity in 3D data for robotic tasks such as autonomous driving. Recent approaches employ diffusion or flow matching models to generate realistic scenes, but 3D data remains limited compared to RGB datasets with millions of samples. We introduce R3DPA, the first LiDAR scene generation method to unlock image-pretrained priors for LiDAR point clouds, and leverage self-supervised 3D representations for state-of-the-art results. Specifically, we (i) align intermediate features of our generative model with self-supervised 3D features, which substantially improves generation quality; (ii) transfer knowledge from large-scale image-pretrained generative models to LiDAR generation, mitigating limited LiDAR datasets; and (iii) enable point cloud control at inference for object inpainting and scene mixing with solely an unconditional model. On the…
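The feature alignment in point (i) can be illustrated with a small sketch. The abstract does not state which loss R3DPA uses, so the negative cosine similarity below is an assumption, as are the function name `alignment_loss`, the linear projection `proj`, and all array shapes. It is meant only to show the general idea of pulling a generator's intermediate features toward features from a frozen self-supervised encoder, not the authors' implementation.

```python
import numpy as np

def alignment_loss(hidden, target, proj):
    """Negative mean cosine similarity between projected generator features
    and frozen encoder features (hypothetical form of the alignment term).

    hidden: (N, D_h) intermediate features from the generative model
    target: (N, D_t) features from a frozen self-supervised 3D encoder
    proj:   (D_h, D_t) linear projection; it would be learned in practice
    """
    projected = hidden @ proj
    eps = 1e-8  # avoids division by zero for all-zero feature rows
    p = projected / (np.linalg.norm(projected, axis=1, keepdims=True) + eps)
    t = target / (np.linalg.norm(target, axis=1, keepdims=True) + eps)
    # Cosine similarity per row, averaged, negated so that lower is better.
    return float(-(p * t).sum(axis=1).mean())

# Toy data standing in for real features (purely illustrative).
rng = np.random.default_rng(0)
target = rng.normal(size=(16, 8))
identity = np.eye(8)

# Perfectly aligned features give a loss close to -1.
loss_aligned = alignment_loss(target, target, identity)
# Unrelated random features give a strictly higher (worse) loss.
loss_random = alignment_loss(rng.normal(size=(16, 8)), target, identity)
```

In a training loop, a term like this would typically be added to the diffusion or flow matching objective with a weighting coefficient, so the generator learns to denoise while its internal features stay close to the pretrained 3D representation.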

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Advanced Vision and Imaging