iTACO: Interactable Digital Twins of Articulated Objects from Casually Captured RGBD Videos
Weikun Peng, Jun Lv, Cewu Lu, Manolis Savva

TL;DR
iTACO is a novel framework that creates interactable digital twins of articulated objects from casually captured RGBD videos, enabling scalable and practical digitization for robotics and AI applications.
Contribution
The paper introduces iTACO, a coarse-to-fine method for segmenting and analyzing articulated objects from casual RGBD videos, and provides a large new dataset for evaluation.
Findings
iTACO outperforms existing methods on synthetic and real videos.
The dataset contains 784 videos of 284 objects, 20 times larger than prior datasets.
iTACO effectively handles object and camera motion, occlusions, and casual capture conditions.
Abstract
Articulated objects are prevalent in daily life. Interactable digital twins of such objects have numerous applications in embodied AI and robotics. Unfortunately, current methods to digitize articulated real-world objects require carefully captured data, preventing practical, scalable, and generalizable acquisition. We focus on motion analysis and part-level segmentation of an articulated object from a casually captured RGBD video shot with a hand-held camera. A casually captured video of an interaction with an articulated object is easy to obtain at scale using smartphones. However, this setting is challenging due to simultaneous object and camera motion and significant occlusions as the person interacts with the object. To tackle these challenges, we introduce iTACO: a coarse-to-fine framework that infers joint parameters and segments movable parts of the object from a dynamic RGBD…
