Pointmap-Conditioned Diffusion for Consistent Novel View Synthesis

Thang-Anh-Quan Nguyen; Nathan Piasco; Luis Roldão; Moussab Bennehar; Dzmitry Tsishkou; Laurent Caraffa; Jean-Philippe Tarel; Roland Brémond

arXiv:2501.02913·cs.CV·December 25, 2025

TL;DR

PointmapDiff is a novel view synthesis framework that conditions pre-trained 2D diffusion models on point maps to generate consistent, high-quality views of urban driving scenes, where the only reliable inputs are limited RGB captures and sparse LiDAR points.

Contribution

PointmapDiff introduces point map conditioning, implemented with reference attention layers and a ControlNet for point map features, to improve the accuracy and geometric fidelity of view synthesis in complex urban environments.

Findings

01. Achieves high-quality, consistent view synthesis in urban driving scenes.

02. Effectively utilizes sparse LiDAR and depth maps for conditioning.

03. Can be distilled into 3D representations like Gaussian Splatting.

Abstract

Synthesizing extrapolated views remains a difficult task, especially in urban driving scenes, where the only reliable sources of data are limited RGB captures and sparse LiDAR points. To address this problem, we present PointmapDiff, a framework for novel view synthesis that utilizes pre-trained 2D diffusion models. Our method leverages point maps (i.e., rasterized 3D scene coordinates) as a conditioning signal, capturing geometric and photometric priors from the reference images to guide the image generation process. With the proposed reference attention layers and ControlNet for point map features, PointmapDiff can generate accurate and consistent results across varying viewpoints while respecting geometric fidelity. Experiments on real-life driving data demonstrate that our method achieves high-quality generation with flexibility over point map conditioning signals (e.g., dense depth…
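The abstract describes point maps as rasterized 3D scene coordinates. As a rough illustration of that idea only, the sketch below projects a set of 3D points (for example, sparse LiDAR returns) into a target camera and stores, at each pixel, the coordinates of the nearest point that lands there. The function name `rasterize_pointmap`, the pinhole camera model, the choice to store world-frame coordinates, and the zero fill for empty pixels are all assumptions made for this example; the excerpt does not specify how PointmapDiff builds its point maps.

```python
import numpy as np

def rasterize_pointmap(points_world, K, R, t, height, width):
    """Rasterize 3D points into an (H, W, 3) map of scene coordinates.

    Hypothetical helper, not the paper's implementation.
    points_world: (N, 3) points in world coordinates.
    K: (3, 3) pinhole intrinsics; R (3, 3) and t (3,) map world -> camera.
    Returns (pointmap, mask); pixels hit by no point stay zero with mask False.
    """
    points_world = np.asarray(points_world, dtype=np.float64)
    pts_cam = points_world @ R.T + t              # world -> camera frame
    z = pts_cam[:, 2]

    # Discard points behind (or on) the camera plane.
    front = z > 1e-6
    pts_cam, pts_w, z = pts_cam[front], points_world[front], z[front]

    # Perspective projection to integer pixel coordinates.
    uvw = pts_cam @ K.T
    u = np.floor(uvw[:, 0] / z).astype(np.int64)
    v = np.floor(uvw[:, 1] / z).astype(np.int64)

    # Keep only points that fall inside the image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z, pts_w = u[inside], v[inside], z[inside], pts_w[inside]

    # Z-buffer: sort by pixel index, then depth, and keep the nearest point
    # per pixel so the assignment below never writes a pixel twice.
    flat = v * width + u
    order = np.lexsort((z, flat))
    flat_sorted = flat[order]
    first = np.ones(flat_sorted.shape[0], dtype=bool)
    first[1:] = flat_sorted[1:] != flat_sorted[:-1]
    keep = order[first]

    pointmap = np.zeros((height, width, 3), dtype=np.float64)
    mask = np.zeros((height, width), dtype=bool)
    pointmap[v[keep], u[keep]] = pts_w[keep]
    mask[v[keep], u[keep]] = True
    return pointmap, mask
```

With sparse inputs most pixels remain empty, which is why the sketch returns a validity mask alongside the map. A dense depth map could be turned into a point map in the same spirit by unprojecting every pixel instead of projecting a point cloud.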

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging · Simulation and Modeling Applications

Methods: Softmax · Attention Is All You Need · Diffusion