GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts
Haoran Geng, Helin Xu, Chengyang Zhao, Chao Xu, Li Yi, Siyuan Huang, He Wang

TL;DR
This paper introduces GAPartNet, a large-scale dataset with part annotations across multiple object categories, and proposes methods for cross-category part segmentation, pose estimation, and manipulation that generalize well to unseen categories.
Contribution
The work presents a new dataset, GAPartNet, with rich part-level annotations, and develops a domain-generalizable approach for part segmentation, pose estimation, and manipulation across diverse object categories.
Findings
The proposed method outperforms existing approaches on both seen and unseen object categories.
Part-based manipulation heuristics enable generalization to new object categories.
GAPartNet provides a valuable resource for cross-category object perception and manipulation.
Abstract
For years, researchers have been devoted to generalizable object perception and manipulation, where cross-category generalizability is highly desired yet underexplored. In this work, we propose to learn such cross-category skills via Generalizable and Actionable Parts (GAParts). By identifying and defining 9 GAPart classes (lids, handles, etc.) in 27 object categories, we construct a large-scale part-centric interactive dataset, GAPartNet, where we provide rich, part-level annotations (semantics, poses) for 8,489 part instances on 1,166 objects. Based on GAPartNet, we investigate three cross-category tasks: part segmentation, part pose estimation, and part-based object manipulation. Given the significant domain gaps between seen and unseen object categories, we propose a robust 3D segmentation method from the perspective of domain generalization by integrating adversarial learning…
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Neural Network Applications · Human Pose and Action Recognition
