Structure from Action: Learning Interactions for Articulated Object 3D Structure Discovery
Neil Nie, Samir Yitzhak Gadre, Kiana Ehsani, Shuran Song

TL;DR
This paper presents Structure from Action (SfA), a novel framework that learns 3D part geometry and joint parameters of unseen articulated objects through inferred interactions, enabling accurate reconstruction and generalization to new categories.
Contribution
The paper introduces SfA, which combines interaction inference with 3D perception to discover the structure of articulated objects, especially objects from categories not seen during training.
Findings
SfA outperforms a pipeline of state-of-the-art components by 25.4 3D IoU points on unseen categories.
SfA generalizes well from simulation to real-world objects.
SfA accurately segments parts and infers joint parameters across diverse object categories.
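The 3D IoU figure above is an intersection-over-union score between predicted and ground-truth geometry. The following is a minimal sketch of that metric, assuming each part is represented as a set of occupied voxel indices. The representation, the function name voxel_iou_3d, and the helper box are illustrative assumptions, not the paper's evaluation code.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: a shape is a set of occupied (x, y, z) voxel indices.
Voxel = Tuple[int, int, int]


def voxel_iou_3d(pred: Iterable[Voxel], target: Iterable[Voxel]) -> float:
    """Intersection over union of two sets of occupied voxels."""
    pred_set, target_set = set(pred), set(target)
    union = pred_set | target_set
    if not union:
        # Both shapes are empty; treat them as a perfect match (a convention).
        return 1.0
    return len(pred_set & target_set) / len(union)


def box(x_range: range, y_range: range, z_range: range) -> Set[Voxel]:
    """Illustrative helper: all voxels inside an axis-aligned box."""
    return {(x, y, z) for x in x_range for y in y_range for z in z_range}


# Two 2x4x4 slabs that overlap in one 1x4x4 layer.
pred = box(range(0, 2), range(4), range(4))    # 32 voxels
target = box(range(1, 3), range(4), range(4))  # 32 voxels
iou = voxel_iou_3d(pred, target)               # 16 shared / 48 total
print(f"3D IoU: {iou:.3f}")
```

If IoU is reported on a 0 to 100 scale, as "points" usually implies, a gain of 25.4 points would correspond to a difference of 0.254 in the value this function returns.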
Abstract
We introduce Structure from Action (SfA), a framework to discover 3D part geometry and joint parameters of unseen articulated objects via a sequence of inferred interactions. Our key insight is that 3D interaction and perception should be considered in conjunction to construct 3D articulated CAD models, especially for categories not seen during training. By selecting informative interactions, SfA discovers parts and reveals occluded surfaces, like the inside of a closed drawer. By aggregating visual observations in 3D, SfA accurately segments multiple parts, reconstructs part geometry, and infers all joint parameters in a canonical coordinate frame. Our experiments demonstrate that a SfA model trained in simulation can generalize to many unseen object categories with diverse structures and to real-world objects. Empirically, SfA outperforms a pipeline of state-of-the-art components by…
Taxonomy
Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Human Motion and Animation
