ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
Wangbo Yu, Jinbo Xing, Li Yuan, Wenbo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian

TL;DR
ViewCrafter introduces a novel approach combining video diffusion models and point-based 3D clues to synthesize high-fidelity, consistent novel views from sparse images, enabling immersive and scene-level text-to-3D applications.
Contribution
The paper presents a new method that leverages video diffusion models with iterative view synthesis and camera planning to improve novel view synthesis from limited input images.
Findings
Demonstrates strong generalization across diverse datasets.
Achieves high-fidelity and consistent novel view synthesis.
Enables real-time rendering and scene-level text-to-3D generation.
Abstract
Despite recent advancements in neural 3D reconstruction, the dependence on dense multi-view captures restricts their broader applicability. In this work, we propose ViewCrafter, a novel method for synthesizing high-fidelity novel views of generic scenes from single or sparse images with the prior of video diffusion models. Our method takes advantage of the powerful generation capabilities of video diffusion models and the coarse 3D clues offered by a point-based representation to generate high-quality video frames with precise camera pose control. To further enlarge the generation range of novel views, we tailor an iterative view synthesis strategy together with a camera trajectory planning algorithm to progressively extend the 3D clues and the areas covered by the novel views. With ViewCrafter, we can facilitate various applications, such as immersive experiences with real-time…
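The abstract names two ingredients, point-based 3D clues rendered from a target camera and camera trajectory planning, without giving details. The sketch below illustrates only the geometric side under stated assumptions: a pinhole camera, a point cloud splatted to a boolean coverage mask, and a hypothetical heuristic that picks the candidate pose with the most uncovered pixels below a cap. The function names, the `max_holes` parameter, and the selection rule are illustrative assumptions, not the paper's algorithm, and no diffusion model is involved.

```python
import numpy as np

def render_coverage(points, K, R, t, height, width):
    """Project world-space points into a pinhole camera and
    return a boolean mask of the pixels hit by at least one point."""
    cam = points @ R.T + t                 # world -> camera coordinates
    cam = cam[cam[:, 2] > 1e-6]            # keep points in front of the camera
    uvw = cam @ K.T                        # apply intrinsics
    uv = np.round(uvw[:, :2] / uvw[:, 2:3]).astype(int)
    inside = (
        (uv[:, 0] >= 0) & (uv[:, 0] < width)
        & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    )
    mask = np.zeros((height, width), dtype=bool)
    mask[uv[inside, 1], uv[inside, 0]] = True
    return mask

def hole_ratio(mask):
    """Fraction of pixels not covered by any projected point."""
    return 1.0 - mask.mean()

def pick_next_pose(points, K, candidates, height, width, max_holes=0.6):
    """Hypothetical planning heuristic (an assumption, not the paper's rule):
    choose the candidate (R, t) whose render has the largest hole ratio
    that does not exceed max_holes. Returns (index, ratio), or (None, -1.0)
    if no candidate qualifies."""
    best, best_ratio = None, -1.0
    for i, (R, t) in enumerate(candidates):
        ratio = hole_ratio(render_coverage(points, K, R, t, height, width))
        if best_ratio < ratio <= max_holes:
            best, best_ratio = i, ratio
    return best, best_ratio
```

In an iterative scheme of the kind the abstract describes, the uncovered pixels at the chosen pose would be the regions a generative model has to fill in, after which the newly synthesized content could be back-projected to extend the point cloud before planning the next pose.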
Taxonomy
Topics: Video Coding and Compression Technologies · Advanced Vision and Imaging · Advanced Image Processing Techniques
Methods: Diffusion
