DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation
Zhiqi Li, Yiming Chen, Peidong Liu

TL;DR
DreamMesh4D introduces a novel mesh-based framework with Gaussian splats and a hybrid skinning algorithm to generate high-quality 4D objects from monocular videos, improving spatial-temporal consistency and surface appearance.
Contribution
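DreamMesh4D combines a mesh representation with geometric skinning and Gaussian splats bound to the mesh faces for video-to-4D generation, targeting the spatial-temporal consistency and surface appearance that prior NeRF-based and Gaussian Splatting-based methods struggle to achieve.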
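To make the term "geometric skinning" concrete, here is a minimal sketch of linear blend skinning, one standard geometric skinning scheme. It is an illustration only: this page does not specify how the paper's hybrid skinning algorithm is built, and the names `apply_rigid` and `linear_blend_skinning` are hypothetical. Each control point carries a rigid transform, and a mesh vertex moves to the weighted average of where each transform would send it.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Tuple[Vec3, Vec3, Vec3]

IDENTITY: Mat3 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def apply_rigid(rotation: Mat3, translation: Vec3, point: Vec3) -> Vec3:
    """Apply a rigid transform: rotation @ point + translation."""
    return tuple(
        sum(r * p for r, p in zip(row, point)) + t
        for row, t in zip(rotation, translation)
    )


def linear_blend_skinning(
    vertex: Vec3,
    transforms: Sequence[Tuple[Mat3, Vec3]],
    weights: Sequence[float],
) -> Vec3:
    """Blend the per-control-point transformed positions of one vertex.

    Weights are normalized here so they always sum to one.
    """
    total = sum(weights)
    blended = [0.0, 0.0, 0.0]
    for (rotation, translation), weight in zip(transforms, weights):
        moved = apply_rigid(rotation, translation, vertex)
        for axis in range(3):
            blended[axis] += (weight / total) * moved[axis]
    return tuple(blended)


# Hypothetical example: two control points, one static and one shifted along x.
transforms = [(IDENTITY, (0.0, 0.0, 0.0)), (IDENTITY, (2.0, 0.0, 0.0))]
skinned = linear_blend_skinning((1.0, 1.0, 1.0), transforms, [0.5, 0.5])
```

With equal weights the vertex lands halfway between the two candidate positions. Because only a small set of control points carries transforms, the number of parameters to optimize stays far below the number of mesh vertices.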
Findings
Superior spatial-temporal consistency in generated 4D objects
Enhanced surface appearance quality
Compatibility with modern graphics pipelines
Abstract
Recent advancements in 2D/3D generative techniques have facilitated the generation of dynamic 3D objects from monocular videos. Previous methods mainly rely on implicit neural radiance fields (NeRF) or explicit Gaussian Splatting as the underlying representation, and struggle to achieve satisfactory spatial-temporal consistency and surface appearance. Drawing inspiration from modern 3D animation pipelines, we introduce DreamMesh4D, a novel framework combining a mesh representation with a geometric skinning technique to generate a high-quality 4D object from a monocular video. Instead of utilizing a classical texture map for appearance, we bind Gaussian splats to the triangle faces of the mesh for differentiable optimization of both the texture and the mesh vertices. In particular, DreamMesh4D begins with a coarse mesh obtained through an image-to-3D generation procedure. Sparse points are then uniformly…
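The idea of binding Gaussian splats to triangle faces can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's parameterization: the Gaussian center is stored as fixed barycentric weights plus an offset along the face normal, and the helper names (`barycentric_point`, `face_normal`, `bound_gaussian_center`) are hypothetical. Because the center is a function of the triangle's vertices, moving the vertices moves the splat with them.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def barycentric_point(tri: Sequence[Vec3], weights: Vec3) -> Vec3:
    """Point on the triangle at the given barycentric weights."""
    w0, w1, w2 = weights
    return tuple(w0 * a + w1 * b + w2 * c for a, b, c in zip(*tri))


def face_normal(tri: Sequence[Vec3]) -> Vec3:
    """Unit normal of the triangle (cross product of two edges)."""
    v0, v1, v2 = tri
    e1 = tuple(b - a for a, b in zip(v0, v1))
    e2 = tuple(b - a for a, b in zip(v0, v2))
    n = (
        e1[1] * e2[2] - e1[2] * e2[1],
        e1[2] * e2[0] - e1[0] * e2[2],
        e1[0] * e2[1] - e1[1] * e2[0],
    )
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)


def bound_gaussian_center(
    tri: Sequence[Vec3], weights: Vec3, normal_offset: float
) -> Vec3:
    """Gaussian center attached to a face: surface point plus normal offset."""
    base = barycentric_point(tri, weights)
    normal = face_normal(tri)
    return tuple(b + normal_offset * n for b, n in zip(base, normal))


# Hypothetical example: the same binding evaluated on a rest and a moved face.
rest_tri = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved_tri = [(x, y, z + 2.0) for x, y, z in rest_tri]
binding = ((0.5, 0.25, 0.25), 0.1)  # (barycentric weights, normal offset)

rest_center = bound_gaussian_center(rest_tri, *binding)
moved_center = bound_gaussian_center(moved_tri, *binding)
```

Every step here is a smooth function of the vertex positions, so a differentiable renderer could in principle pass image-space gradients back to the mesh vertices through the bound splats.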
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Video Analysis and Summarization
