NavCrafter: Exploring 3D Scenes from a Single Image

Hongbo Duan; Peiyu Zhuang; Yi Liu; Zhengyang Zhang; Yuxin Zhang; Pengting Luo; Fangming Liu; Xueqian Wang

arXiv:2604.02828 · cs.CV · April 6, 2026

TL;DR

NavCrafter is a framework for exploring a 3D scene from a single image: it generates camera-controllable novel-view video and improves the fidelity of the resulting 3D reconstruction.

Contribution

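NavCrafter introduces a multi-stage camera control mechanism (dual-branch camera injection and attention modulation), a collision-aware camera trajectory planner, and an enhanced 3D Gaussian Splatting (3DGS) pipeline with depth-aligned supervision, structural regularization, and refinement.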
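To make the idea of a collision-aware trajectory planner concrete, here is a minimal sketch of one way to reject camera positions that come too close to scene geometry. It is an illustration under stated assumptions, not the authors' planner: it assumes a metric depth map for the input image and pinhole intrinsics, and the function names `unproject_depth` and `is_collision_free` and the `min_clearance` threshold are hypothetical.

```python
import numpy as np

def unproject_depth(depth, fx, fy, cx, cy):
    """Back-project a depth map of shape (H, W) to camera-frame points (H*W, 3).

    Assumes a pinhole camera; fx, fy, cx, cy are illustrative intrinsics.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def is_collision_free(camera_positions, scene_points, min_clearance):
    """Return a boolean array: True where a camera position keeps at least
    min_clearance distance from every scene point (brute-force check)."""
    diffs = camera_positions[:, None, :] - scene_points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return dists.min(axis=1) >= min_clearance

# Hypothetical usage: a flat wall 2 units in front of the camera.
depth = np.full((4, 4), 2.0)
points = unproject_depth(depth, fx=4.0, fy=4.0, cx=2.0, cy=2.0)
candidates = np.array([[0.0, 0.0, 0.0],   # starting pose, far from the wall
                       [0.0, 0.0, 1.9]])  # almost touching the wall
safe = is_collision_free(candidates, points, min_clearance=0.5)
```

A planner could apply such a filter to every waypoint of a candidate trajectory and keep only paths whose waypoints all pass. The brute-force distance check is quadratic in memory, so a real system would likely use a spatial index or an occupancy grid instead.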

Findings

01. Achieves state-of-the-art novel-view synthesis under large viewpoint shifts.

02. Substantially improves 3D reconstruction fidelity.

03. Demonstrates effective scene coverage expansion from a single image.

Abstract

Creating flexible 3D scenes from a single image is vital when direct 3D data acquisition is costly or impractical. We introduce NavCrafter, a novel framework that explores 3D scenes from a single image by synthesizing novel-view video sequences with camera controllability and temporal-spatial consistency. NavCrafter leverages video diffusion models to capture rich 3D priors and adopts a geometry-aware expansion strategy to progressively extend scene coverage. To enable controllable multi-view synthesis, we introduce a multi-stage camera control mechanism that conditions diffusion models with diverse trajectories via dual-branch camera injection and attention modulation. We further propose a collision-aware camera trajectory planner and an enhanced 3D Gaussian Splatting (3DGS) pipeline with depth-aligned supervision, structural regularization and refinement. Extensive experiments…
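The abstract mentions depth-aligned supervision without giving details here. One common way to align a relative (scale- and shift-ambiguous) depth prediction to a reference depth before using it as supervision is a least-squares fit of a scale and a shift. The sketch below shows that generic technique; it is an assumption for illustration and not necessarily the alignment used in NavCrafter, and the names `align_depth` and `aligned_l1` are hypothetical.

```python
import numpy as np

def align_depth(pred, ref, mask=None):
    """Fit scale s and shift t minimizing ||s * pred + t - ref||^2 over
    the pixels selected by mask (all pixels if mask is None)."""
    if mask is None:
        mask = np.ones_like(pred, dtype=bool)
    p = pred[mask].ravel()
    r = ref[mask].ravel()
    A = np.stack([p, np.ones_like(p)], axis=1)  # columns: [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, r, rcond=None)
    return s, t

def aligned_l1(pred, ref, mask=None):
    """Mean absolute error between ref and the scale/shift-aligned pred."""
    if mask is None:
        mask = np.ones_like(pred, dtype=bool)
    s, t = align_depth(pred, ref, mask)
    return np.abs(s * pred[mask] + t - ref[mask]).mean()

# Hypothetical usage: ref is an exact affine transform of pred,
# so the fit should recover the scale and shift.
rng = np.random.default_rng(0)
pred = rng.uniform(1.0, 5.0, size=(8, 8))
ref = 2.0 * pred + 0.5
s, t = align_depth(pred, ref)
```

In a 3DGS training loop, a term like `aligned_l1` would typically compare rendered depth against the aligned monocular depth and be added to the photometric loss with a small weight; the weighting and masking strategy are design choices not specified in this abstract.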
