Articulat3D: Reconstructing Articulated Digital Twins From Monocular Videos with Geometric and Motion Constraints
Lijun Guo, Haoyu Zhao, Xingyue Zhao, Rong Fu, Linghao Zhuang, Siteng Huang, Zhongyu Li, Hua Zou

TL;DR
Articulat3D introduces a novel framework for reconstructing high-fidelity digital twins of articulated objects from casual monocular videos by leveraging geometric and motion constraints, enabling scalable and accurate 3D modeling.
Contribution
The paper presents a new method that jointly enforces geometric and motion constraints for 3D reconstruction from monocular videos, including a motion prior-driven initialization and a refinement process with learnable kinematic primitives.
Findings
Achieves state-of-the-art results on synthetic benchmarks.
Successfully reconstructs articulated objects from casual monocular videos.
Demonstrates robustness in real-world uncontrolled conditions.
Abstract
Building high-fidelity digital twins of articulated objects from visual data remains a central challenge. Existing approaches depend on multi-view captures of the object in discrete, static states, which severely constrains their real-world scalability. In this paper, we introduce Articulat3D, a novel framework that constructs such digital twins from casually captured monocular videos by jointly enforcing explicit 3D geometric and motion constraints. We first propose Motion Prior-Driven Initialization, which leverages 3D point tracks to exploit the low-dimensional structure of articulated motion. By modeling scene dynamics with a compact set of motion bases, we facilitate soft decomposition of the scene into multiple rigidly-moving groups. Building on this initialization, we introduce Geometric and Motion Constraints Refinement, which enforces physically plausible articulation through…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
