Generalizable task-oriented object grasping through LLM-guided ontology and similarity-based planning
Hao Chen, Takuya Kiyokawa, Weiwei Wan, Kensuke Harada

TL;DR
This paper presents a geometry-centric, LLM-guided approach to task-oriented object grasping that improves generalization across diverse objects and tasks by using an ontology and similarity-based planning.
Contribution
It introduces a novel ontology and geometric analysis framework that enhances the robustness and generalization of task-oriented grasping without relying on semantic visual features.
Findings
High accuracy in functional part selection and grasp generation.
Effective generalization to novel objects and tasks.
Validation through real-world experiments.
Abstract
Task-oriented grasping (TOG) is more challenging than simple object grasping because it requires precise identification of object parts and careful selection of grasping areas to ensure effective and robust manipulation. While recent approaches have trained large-scale vision-language models to integrate part-level object segmentation with task-aware grasp planning, their instability in part recognition and grasp inference limits their ability to generalize across diverse objects and tasks. To address this issue, we introduce a novel, geometry-centric strategy for more generalizable TOG that does not rely on semantic features from visual recognition, effectively overcoming the viewpoint sensitivity of model-based approaches. Our main proposals include: 1) an object-part-task ontology for functional part selection based on intuitive human commands, constructed using a Large Language…
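To make the idea of an object-part-task ontology concrete, here is a minimal sketch of how such a structure could be queried once it exists. It is an illustration under stated assumptions, not the authors' implementation: the entries, the `ONTOLOGY` table, and the `select_grasp_part` helper are all hypothetical, and the table is written by hand here, whereas the abstract describes the ontology as constructed with a large language model.

```python
from typing import Dict, Optional, Tuple

# Hypothetical ontology entries: (object, task) -> part to grasp.
# These examples are illustrative only and are not taken from the paper.
ONTOLOGY: Dict[Tuple[str, str], str] = {
    ("mug", "pour"): "handle",
    ("knife", "cut"): "handle",
    ("knife", "handover"): "blade",
}


def select_grasp_part(obj: str, task: str) -> Optional[str]:
    """Return the part to grasp for an object-task pair, or None if unknown."""
    # Normalize case so that "Mug" and "mug" map to the same entry.
    return ONTOLOGY.get((obj.lower(), task.lower()))


if __name__ == "__main__":
    print(select_grasp_part("Mug", "pour"))
    print(select_grasp_part("mug", "stir"))
```

A flat lookup like this covers only object-task pairs that were listed in advance. Handling novel objects would need an additional step, such as the similarity-based planning the title refers to, which this sketch does not attempt.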
