Text2Grasp: Grasp synthesis by text prompts of object grasping parts
Xiaoyun Chang, Yi Sun

TL;DR
Text2Grasp introduces a text-guided diffusion approach for precise, part-level grasp synthesis, enabling controllable and diverse hand-object interactions guided by natural language descriptions.
Contribution
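The paper presents a two-stage method for text-guided grasp synthesis: a text-guided diffusion model generates a coarse grasp pose, and a hand-object contact optimization step refines it. A large language model extends the method to task-level and personalized text descriptions without additional manual annotations.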
Findings
Achieves accurate part-level grasp control
Maintains high grasp quality comparable to existing methods
Enables task-level and personalized grasp descriptions
Abstract
The hand plays a pivotal role in the human ability to grasp and manipulate objects, and controllable grasp synthesis is key to successfully performing downstream tasks. Existing methods that use human intention or task-level language as control signals for grasping inherently face ambiguity. To address this challenge, we propose Text2Grasp, a grasp synthesis method guided by text prompts of object grasping parts, which provides more precise control. Specifically, we present a two-stage method that first uses a text-guided diffusion model, TextGraspDiff, to generate a coarse grasp pose, and then applies a hand-object contact optimization process to ensure both plausibility and diversity. Furthermore, by leveraging a Large Language Model, our method facilitates grasp synthesis guided by task-level and personalized text descriptions without additional manual annotations. Extensive experiments…
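The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the diffusion model TextGraspDiff is replaced by a hypothetical stand-in that samples noisy fingertip positions around the named part, and the contact optimization is reduced to gradient descent on the squared distance from each fingertip to its nearest point on that part. All function names, the keyword-based part lookup, and the point-set object representation are assumptions made for illustration.

```python
import random


def select_part(prompt, parts):
    """Hypothetical text conditioning: pick the first part named in the prompt."""
    text = prompt.lower()
    for name, points in parts.items():
        if name in text:
            return points
    raise ValueError("prompt does not name a known part")


def coarse_grasp(part_points, n_fingers=5, noise=0.05, seed=0):
    """Stand-in for stage 1 (NOT a diffusion model): noisy samples near the part centroid."""
    rng = random.Random(seed)
    n = len(part_points)
    centroid = [sum(p[i] for p in part_points) / n for i in range(3)]
    return [[c + rng.gauss(0.0, noise) for c in centroid] for _ in range(n_fingers)]


def nearest(point, part_points):
    """Closest part point to a fingertip (brute force)."""
    return min(part_points, key=lambda q: sum((a - b) ** 2 for a, b in zip(point, q)))


def contact_energy(fingertips, part_points):
    """Sum of squared fingertip-to-part distances; zero when every fingertip touches the part."""
    total = 0.0
    for f in fingertips:
        q = nearest(f, part_points)
        total += sum((a - b) ** 2 for a, b in zip(f, q))
    return total


def refine_contacts(fingertips, part_points, step=0.25, iters=50):
    """Toy stage 2: gradient descent on the contact energy, one fingertip at a time."""
    tips = [list(f) for f in fingertips]
    for _ in range(iters):
        for f in tips:
            q = nearest(f, part_points)
            for i in range(3):
                # d/df ||f - q||^2 = 2 (f - q)
                f[i] -= step * 2.0 * (f[i] - q[i])
    return tips


# Illustrative object: a "mug" described by a few surface points per part.
parts = {
    "handle": [(0.10, 0.00, 0.05), (0.12, 0.00, 0.08), (0.10, 0.00, 0.11)],
    "body": [(0.00, 0.04, 0.05), (0.00, -0.04, 0.05), (0.04, 0.00, 0.10)],
}
target = select_part("Grasp the handle of the mug", parts)
coarse = coarse_grasp(target)
refined = refine_contacts(coarse, target)
print(contact_energy(coarse, target), contact_energy(refined, target))
```

The sketch only shows the division of labor: the first stage proposes a pose conditioned on the part named in the text, and the second stage pulls that pose into contact with the object. A real system would operate on a full hand model and would also need to penalize penetration, which this toy energy ignores.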
Taxonomy
Topics: Natural Language Processing Techniques · Robot Manipulation and Learning · Multimodal Machine Learning Applications
