VidSplat: Gaussian Splatting Reconstruction with Geometry-Guided Video Diffusion Priors
Jimin Tang, Wenyuan Zhang, Junsheng Zhou, Zian Huang, Kanle Shi, Shenkun Xu, Yu-Shen Liu, Zhizhong Han

TL;DR
VidSplat introduces a training-free, generative framework leveraging video diffusion priors to improve 3D scene reconstruction from sparse inputs, effectively synthesizing missing views and recovering complete scenes.
Contribution
It presents a novel, training-free, stage-wise denoising and iterative sampling approach that integrates generation with reconstruction for sparse-view 3D scene recovery.
Findings
Outperforms existing methods on standard benchmarks.
Robustly reconstructs scenes from a single image or sparse views.
Effectively synthesizes unobserved regions to complete 3D scenes.
Abstract
Gaussian Splatting has achieved remarkable progress in multi-view surface reconstruction, yet it degrades notably when only a few views are available. Although recent efforts alleviate this issue by enhancing multi-view consistency to produce plausible surfaces, they struggle to infer unseen, occluded, or weakly constrained regions beyond the input coverage. To address this limitation, we present VidSplat, a training-free generative reconstruction framework that leverages powerful video diffusion priors to iteratively synthesize novel views that compensate for missing input coverage and thereby recover complete 3D scenes from sparse inputs. Specifically, we tackle two key challenges to enable the effective integration of generation and reconstruction. First, for 3D-consistent generation, we devise a training-free, stage-wise denoising strategy that adaptively guides the…
