GATER: Learning Grasp-Action-Target Embeddings and Relations for Task-Specific Grasping
Ming Sun, Yue Gao

TL;DR
GATER is an embedding-based framework that models the relationships among grasping tools, actions, and target objects to enable task-specific robotic grasping in unstructured environments. It reaches a 94.6% success rate on task-specific grasping inference and was validated on a real service robot platform.
Contribution
This paper introduces GATER, a method that represents each grasping task as a triplet (grasping tool, desired action, target object) in embedding space, together with a new dataset for task-specific grasping.
Findings
Achieved a 94.6% success rate in task-specific grasping inference
Validated GATER on a real service robot platform
Demonstrated potential for human-robot interaction
Abstract
Intelligent service robots require the ability to perform a variety of tasks in dynamic environments. Despite significant progress in robotic grasping, it is still a challenge for robots to decide on a grasping position when given different tasks in unstructured real-life environments. Overcoming this challenge hinges on a proper knowledge representation framework. Unlike previous work, this paper defines a task as a triplet comprising a grasping tool, a desired action, and a target object. The proposed algorithm, GATER (Grasp-Action-Target Embeddings and Relations), models the relationships among grasping tools, actions, and target objects in embedding space. To validate the method, a novel dataset is created for task-specific grasping. GATER is trained on the new dataset and achieves task-specific grasping inference with a 94.6% success rate. Finally, the effectiveness of…
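To make the triplet idea concrete, the following is a minimal sketch of how a (grasping tool, action, target) triplet could be scored in embedding space. It is not the GATER model: the translation-style distance, the vocabularies, the embedding size, and the names `init_embeddings`, `triplet_score`, and `rank_tools` are all assumptions chosen for illustration, and the embeddings are random rather than trained.

```python
import math
import random

# Hypothetical vocabularies; the labels in the paper's dataset are not given here.
TOOLS = ["knife_handle", "knife_blade", "cup_body"]
ACTIONS = ["cut", "pour"]
TARGETS = ["apple", "bowl"]


def init_embeddings(names, dim, rng):
    """Assign each name a random Gaussian vector of length `dim` (untrained)."""
    return {name: [rng.gauss(0.0, 1.0) for _ in range(dim)] for name in names}


def triplet_score(tool_vec, action_vec, target_vec):
    """Assumed translation-style score: ||tool + action - target||.

    A lower value means the triplet is considered more plausible.
    """
    return math.sqrt(
        sum((g + a - t) ** 2 for g, a, t in zip(tool_vec, action_vec, target_vec))
    )


def rank_tools(action, target, tool_emb, action_emb, target_emb):
    """Sort grasping tools from most to least plausible for (action, target)."""
    scores = {
        name: triplet_score(vec, action_emb[action], target_emb[target])
        for name, vec in tool_emb.items()
    }
    return sorted(scores, key=scores.get)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
tool_emb = init_embeddings(TOOLS, 8, rng)
action_emb = init_embeddings(ACTIONS, 8, rng)
target_emb = init_embeddings(TARGETS, 8, rng)

# With untrained embeddings the order is arbitrary; training would pull
# valid triplets together and push invalid ones apart.
ranking = rank_tools("cut", "apple", tool_emb, action_emb, target_emb)
print(ranking)
```

In a trained model of this kind, task-specific inference would amount to picking the top-ranked grasping tool (or grasp region) for the requested action and target.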
Taxonomy
Topics: Robot Manipulation and Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
