MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds
Jiahui Lei, Yijia Weng, Adam Harley, Leonidas Guibas, and Kostas Daniilidis

TL;DR
MoSca is a 4D reconstruction system that leverages prior vision models and a Gaussian-based motion scaffold to synthesize novel views of dynamic scenes from casual monocular videos, achieving state-of-the-art results on dynamic rendering benchmarks.
Contribution
The paper presents MoSca, a new 4D reconstruction framework that encodes motion and deformation using Gaussian-anchored scaffolds, enabling dynamic scene synthesis without specialized pose estimation tools.
Findings
Achieves state-of-the-art dynamic rendering performance.
Effectively disentangles scene geometry and appearance from motion.
Works well on real-world casual videos.
Abstract
We introduce 4D Motion Scaffolds (MoSca), a modern 4D reconstruction system designed to reconstruct and synthesize novel views of dynamic scenes from monocular videos captured casually in the wild. To address such a challenging and ill-posed inverse problem, we leverage prior knowledge from foundational vision models and lift the video data to a novel Motion Scaffold (MoSca) representation, which compactly and smoothly encodes the underlying motions/deformations. The scene geometry and appearance are then disentangled from the deformation field and are encoded by globally fusing the Gaussians anchored onto the MoSca and optimized via Gaussian Splatting. Additionally, camera focal length and poses can be solved using bundle adjustment without the need for any other pose estimation tools. Experiments demonstrate state-of-the-art performance on dynamic rendering benchmarks and its…
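The idea of anchoring points to a sparse scaffold that carries the motion can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the scaffold is modeled as a handful of nodes, each with one rigid transform (R, t) for a target frame; the blending weights are normalized Gaussian RBF weights; and the blend is a plain linear blend of the per-node motions. The names skinning_weights, apply_rigid, warp_point, and the parameter sigma are hypothetical.

```python
import math


def skinning_weights(point, node_positions, sigma=0.5):
    """Normalized Gaussian RBF weights of a point w.r.t. each scaffold node.

    sigma is a hypothetical bandwidth controlling how local each node's
    influence is; it is an assumption of this sketch.
    """
    raw = []
    for node in node_positions:
        sq_dist = sum((p - n) ** 2 for p, n in zip(point, node))
        raw.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(raw)
    return [w / total for w in raw]


def apply_rigid(rotation, translation, point):
    """Apply a rigid transform x -> R x + t, with R given as a 3x3 nested list."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def warp_point(point, node_positions, node_transforms, sigma=0.5):
    """Move a canonical-space point to a target frame.

    node_transforms[k] is the (R, t) pair of node k for the target frame.
    The point is moved by every node's transform and the results are
    averaged with the skinning weights (linear blending, assumed here).
    """
    weights = skinning_weights(point, node_positions, sigma)
    warped = [0.0, 0.0, 0.0]
    for w, (rotation, translation) in zip(weights, node_transforms):
        moved = apply_rigid(rotation, translation, point)
        warped = [acc + w * m for acc, m in zip(warped, moved)]
    return warped


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Two scaffold nodes: the first translates by one unit along x, the second stays put.
nodes = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
transforms = [(IDENTITY, [1.0, 0.0, 0.0]), (IDENTITY, [0.0, 0.0, 0.0])]

# A point halfway between the nodes receives equal weights from both.
midpoint_warped = warp_point([0.5, 0.0, 0.0], nodes, transforms)
print(midpoint_warped)
```

In this sketch a point's geometry and appearance stay fixed in canonical space while only the node transforms change per frame, which mirrors the separation of appearance from deformation that the abstract describes. The actual blending scheme and motion parameterization used by MoSca may differ.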
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Vision and Imaging · Advanced Image Processing Techniques
