Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding
Haoran Zhou, Gim Hee Lee

TL;DR
Motion4D integrates 2D priors from vision foundation models into a unified 4D Gaussian Splatting representation, targeting the spatial misalignment and temporal flickering that 2D models exhibit on dynamic scenes and improving 3D consistency of both motion and semantics.
Contribution
The paper presents a 4D Gaussian Splatting approach that combines iterative optimization, dynamic motion priors, and semantic refinement to address the lack of 3D consistency in scene understanding from monocular videos.
Findings
Outperforms existing 2D and 3D methods in scene understanding tasks.
Achieves superior 3D motion and semantic consistency.
Enhances dynamic scene analysis with improved accuracy.
Abstract
Recent advancements in foundation models for 2D vision have substantially improved the analysis of dynamic scenes from monocular videos. However, despite their strong generalization capabilities, these models often lack 3D consistency, a fundamental requirement for understanding scene geometry and motion, thereby causing severe spatial misalignment and temporal flickering in complex 3D environments. In this paper, we present Motion4D, a novel framework that addresses these challenges by integrating 2D priors from foundation models into a unified 4D Gaussian Splatting representation. Our method features a two-part iterative optimization framework: 1) Sequential optimization, which updates motion and semantic fields in consecutive stages to maintain local consistency, and 2) Global optimization, which jointly refines all attributes for long-term coherence. To enhance motion accuracy, we…
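The two-part schedule described in the abstract (sequential per-field stages followed by a joint refinement) can be illustrated with a toy sketch. This is not the authors' implementation: the quadratic loss, the `COUPLING` weight, the step counts, and every function name below are assumptions made for illustration, and plain lists of floats stand in for the motion and semantic attributes of a 4D Gaussian Splatting model.

```python
# Toy illustration of a sequential-then-global optimization schedule.
# Everything here is hypothetical: a quadratic loss replaces the real
# rendering and prior losses, and two float lists replace the Gaussians'
# motion and semantic attributes.

COUPLING = 0.5  # assumed weight of the cross-field consistency term


def loss(motion, semantic, prior_m, prior_s):
    """Fit each field to its own 2D prior, plus a term tying the fields together."""
    total = 0.0
    for m, s, pm, ps in zip(motion, semantic, prior_m, prior_s):
        total += (m - pm) ** 2 + (s - ps) ** 2 + COUPLING * (m - s) ** 2
    return total


def grad_motion(m, s, pm):
    # Derivative of the per-element loss with respect to the motion value.
    return 2.0 * (m - pm) + 2.0 * COUPLING * (m - s)


def grad_semantic(m, s, ps):
    # Derivative of the per-element loss with respect to the semantic value.
    return 2.0 * (s - ps) - 2.0 * COUPLING * (m - s)


def optimize(prior_m, prior_s, rounds=10, steps=5, lr=0.1):
    motion = [0.0] * len(prior_m)
    semantic = [0.0] * len(prior_s)
    history = [loss(motion, semantic, prior_m, prior_s)]
    for _ in range(rounds):
        # Sequential part, stage 1: update motion with semantics frozen.
        for _ in range(steps):
            motion = [m - lr * grad_motion(m, s, pm)
                      for m, s, pm in zip(motion, semantic, prior_m)]
        # Sequential part, stage 2: update semantics with motion frozen.
        for _ in range(steps):
            semantic = [s - lr * grad_semantic(m, s, ps)
                        for m, s, ps in zip(motion, semantic, prior_s)]
        # Global part: refine both fields jointly (simultaneous update).
        for _ in range(steps):
            motion, semantic = (
                [m - lr * grad_motion(m, s, pm)
                 for m, s, pm in zip(motion, semantic, prior_m)],
                [s - lr * grad_semantic(m, s, ps)
                 for m, s, ps in zip(motion, semantic, prior_s)],
            )
        history.append(loss(motion, semantic, prior_m, prior_s))
    return motion, semantic, history


if __name__ == "__main__":
    m, s, hist = optimize([1.0, 2.0, 3.0], [0.5, 2.5, 2.0])
    print("loss per round:", [round(h, 4) for h in hist])
```

In this sketch the staged updates let each field settle against a fixed counterpart, while the joint pass reconciles the two through the coupling term. How the actual method orders, weights, or terminates these stages is not specified in the text above.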
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · 3D Shape Modeling and Analysis
