Pointmap-Conditioned Diffusion for Consistent Novel View Synthesis
Thang-Anh-Quan Nguyen, Nathan Piasco, Luis Roldão, Moussab Bennehar, Dzmitry Tsishkou, Laurent Caraffa, Jean-Philippe Tarel, Roland Brémond

TL;DR
PointmapDiff is a framework that conditions pre-trained diffusion models on point maps to generate consistent, high-quality novel views of urban scenes from limited data such as sparse LiDAR and RGB captures.
Contribution
It introduces point map conditioning with reference attention and ControlNet to enhance view synthesis accuracy and geometric fidelity in complex urban environments.
Findings
Achieves high-quality, consistent view synthesis in urban driving scenes.
Effectively utilizes sparse LiDAR and depth maps for conditioning.
Can be distilled into 3D representations like Gaussian Splatting.
Abstract
Synthesizing extrapolated views remains a difficult task, especially in urban driving scenes, where the only reliable sources of data are limited RGB captures and sparse LiDAR points. To address this problem, we present PointmapDiff, a framework for novel view synthesis that utilizes pre-trained 2D diffusion models. Our method leverages point maps (i.e., rasterized 3D scene coordinates) as a conditioning signal, capturing geometric and photometric priors from the reference images to guide the image generation process. With the proposed reference attention layers and ControlNet for point map features, PointmapDiff can generate accurate and consistent results across varying viewpoints while respecting geometric fidelity. Experiments on real-life driving data demonstrate that our method achieves high-quality generation with flexibility over point map conditioning signals (e.g., dense depth…
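To make the idea of a point map concrete, the following is a minimal sketch of rasterizing sparse 3D points into a per-pixel grid of scene coordinates for a target camera. It is an illustration only, not the authors' implementation: the function name `rasterize_pointmap`, the pinhole camera model, the choice to store world-frame coordinates, and the nearest-point (z-buffer) rule are all assumptions made for this example.

```python
import math


def rasterize_pointmap(points_world, K, R, t, height, width):
    """Rasterize 3D points into a height x width point map (hypothetical helper).

    points_world: iterable of (x, y, z) world coordinates.
    K: 3x3 pinhole intrinsics as nested lists (assumed camera model).
    R, t: world-to-camera rotation (3x3 nested lists) and translation (length 3).
    Returns a nested list where pointmap[v][u] is the world-frame (x, y, z)
    of the nearest visible point, or None where no point projects.
    """
    pointmap = [[None] * width for _ in range(height)]
    depth = [[math.inf] * width for _ in range(height)]
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]

    for p in points_world:
        # Transform the point from the world frame into the camera frame.
        xc, yc, zc = (sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        if zc <= 0:
            continue  # behind the camera, cannot be seen
        # Perspective projection to integer pixel indices.
        u = math.floor(fx * xc / zc + cx)
        v = math.floor(fy * yc / zc + cy)
        # Keep only the closest point per pixel (simple z-buffer).
        if 0 <= u < width and 0 <= v < height and zc < depth[v][u]:
            depth[v][u] = zc
            pointmap[v][u] = tuple(p)
    return pointmap
```

With sparse LiDAR input most pixels stay empty (`None` here), which is why a conditioning signal of this kind can range from sparse to dense depending on the source of the 3D points.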
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging · Simulation and Modeling Applications
Methods: Softmax · Attention Is All You Need · Diffusion
