Training-free Task-oriented Grasp Generation

Jiaming Wang; Diwen Liu; Jizhuo Chen; Harold Soh

arXiv:2502.04873·cs.RO·October 7, 2025

Open Access

TL;DR

This paper introduces a training-free approach to task-oriented robotic grasping that combines pre-trained grasp generation models with vision-language models (VLMs) to improve grasp success and task compliance without additional training.

Contribution

It proposes a training-free pipeline that leverages the semantic reasoning of vision-language models for task-specific grasp generation, improving on a baseline method in both grasp success and task compliance rates.

Findings

01 Absolute gains of up to 36.9% in overall success rate over a baseline method.

02 Effective use of vision-language models for task-specific grasping.

03 Significant improvement over the baseline in task compliance rate.

Abstract

This paper presents a training-free pipeline for task-oriented grasp generation that combines pre-trained grasp generation models with vision-language models (VLMs). Unlike traditional approaches that focus solely on stable grasps, our method incorporates task-specific requirements by leveraging the semantic reasoning capabilities of VLMs. We evaluate five querying strategies, each utilizing different visual representations of candidate grasps, and demonstrate significant improvements over a baseline method in both grasp success and task compliance rates, with absolute gains of up to 36.9% in overall success rate. Our results underline the potential of VLMs to enhance task-oriented manipulation, providing insights for future research in robotic grasping and human-robot interaction.
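The general shape of such a pipeline can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `Grasp` structure, the `select_task_grasp` function, the `min_stability` threshold, and the `toy_scorer` stand-in for a VLM query are all hypothetical names introduced here. The abstract does not say how candidates are filtered or how VLM responses are turned into a choice, and the sketch does not model the five querying strategies or their visual representations.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Grasp:
    """A candidate grasp from a pre-trained generator (hypothetical structure)."""
    name: str
    stability: float  # assumed generator confidence in [0, 1]


# Hypothetical VLM interface: (task description, grasp) -> task suitability in [0, 1].
TaskScorer = Callable[[str, Grasp], float]


def select_task_grasp(
    task: str,
    candidates: Sequence[Grasp],
    score_with_vlm: TaskScorer,
    min_stability: float = 0.5,  # assumed threshold, not taken from the paper
) -> Optional[Grasp]:
    """Keep sufficiently stable candidates, then pick the one the scorer rates highest."""
    stable = [g for g in candidates if g.stability >= min_stability]
    if not stable:
        return None
    return max(stable, key=lambda g: score_with_vlm(task, g))


def toy_scorer(task: str, grasp: Grasp) -> float:
    """Stand-in for a real VLM query; hard-codes one task preference for illustration."""
    if task == "pour from the mug" and grasp.name == "handle":
        return 1.0
    return 0.1


candidates = [Grasp("handle", 0.7), Grasp("rim", 0.9), Grasp("base", 0.3)]
best = select_task_grasp("pour from the mug", candidates, toy_scorer)
print(best)
```

In this toy run the most stable candidate (`rim`) is not the one selected: the task-aware scorer prefers `handle`, which illustrates the abstract's distinction between stable grasps and task-compliant ones.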

Taxonomy

Topics: Reinforcement Learning in Robotics · Teaching and Learning Programming · Robot Manipulation and Learning