Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels
Yikai Wang, Xinzhou Wang, Zilong Chen, Zhengyi Wang, Fuchun Sun, Jun Zhu

TL;DR
Vidu4D reconstructs 4D content from a single generated video, using Dynamic Gaussian Surfels to model motion and deformation for high-fidelity virtual content creation.
Contribution
The paper presents Dynamic Gaussian Surfels (DGS), a new technique for precise 4D reconstruction from single videos, addressing non-rigidity and frame distortion challenges.
Findings
Achieves high-fidelity 4D reconstructions with spatial and temporal coherence.
Reduces texture flickering during warping, capturing fine details.
Demonstrates effective text-to-4D generation with existing video models.
Abstract
Video generative models are receiving particular attention given their ability to generate realistic and imaginative frames. These models are also observed to exhibit strong 3D consistency, significantly enhancing their potential to act as world simulators. In this work, we present Vidu4D, a novel reconstruction model that excels in accurately reconstructing 4D (i.e., sequential 3D) representations from single generated videos, addressing challenges associated with non-rigidity and frame distortion. This capability is pivotal for creating high-fidelity virtual content that maintains both spatial and temporal coherence. At the core of Vidu4D is our proposed Dynamic Gaussian Surfels (DGS) technique. DGS optimizes time-varying warping functions to transform Gaussian surfels (surface elements) from a static state to a dynamically warped state. This transformation enables a precise…
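The idea of warping a surfel from a static state to a time-dependent state can be illustrated with a toy example. The sketch below is not the authors' implementation: the `Surfel` class, the `warp_surfel` function, and the fixed rotation-plus-translation warp (parameters `omega` and `velocity`) are hypothetical names and assumptions chosen for illustration. In DGS the warping functions are optimized from video, whereas here the warp is a hand-written closed form.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Surfel:
    """A surface element: a flat disk described by a center and a unit normal."""
    center: Vec3
    normal: Vec3


def rotate_z(v: Vec3, angle: float) -> Vec3:
    """Rotate a 3D vector about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)


def warp_surfel(s: Surfel, t: float,
                omega: float = math.pi / 2,
                velocity: Vec3 = (0.0, 0.0, 1.0)) -> Surfel:
    """Map a static surfel to its state at time t.

    Assumed toy warp: rotate about z by omega * t, then translate by
    velocity * t. A learned warp would replace this closed form.
    """
    angle = omega * t
    cx, cy, cz = rotate_z(s.center, angle)
    vx, vy, vz = velocity
    center = (cx + vx * t, cy + vy * t, cz + vz * t)
    # Normals are directions, so they rotate but do not translate.
    normal = rotate_z(s.normal, angle)
    return Surfel(center, normal)


static = Surfel(center=(1.0, 0.0, 0.0), normal=(1.0, 0.0, 0.0))
warped = warp_surfel(static, t=1.0)
```

At `t=0.0` the warp is the identity, so the static surfel is returned unchanged. For later times both the center and the normal follow the same rotation, which keeps the disk's orientation consistent with its motion.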
Taxonomy
Topics: Advanced Vision and Imaging
