DexGANGrasp: Dexterous Generative Adversarial Grasping Synthesis for Task-Oriented Manipulation

Qian Feng; David S. Martinez Lema; Mohammadhossein Malmir; Hang Li; Jianxiang Feng; Zhaopeng Chen; Alois Knoll

arXiv:2407.17348·cs.RO·November 26, 2024

TL;DR

DexGanGrasp synthesizes dexterous grasps from a single view in real time, pairing a cGAN-based generator with a discriminator-like evaluator that assesses grasp stability. It achieves an 18.57% higher real-world success rate than the FFHNet baseline and extends to task-oriented grasping through multimodal language and vision models.

Contribution

The paper introduces DexGanGrasp, a novel real-time dexterous grasp synthesis framework with a new discriminator for grasp evaluation, and extends it to task-oriented grasping using multimodal language and vision models.

Findings

01

Outperforms the FFHNet baseline with an 18.57% higher success rate in real-world tests.

02

Effective in real-time grasp synthesis from a single view.

03

Successfully extends to task-oriented grasping with multimodal models.

Abstract

We introduce DexGanGrasp, a dexterous grasping synthesis method that generates and evaluates grasps from a single view in real time. DexGanGrasp comprises a Conditional Generative Adversarial Network (cGAN)-based DexGenerator to generate dexterous grasps and a discriminator-like DexEvaluator to assess the stability of these grasps. Extensive simulation and real-world experiments showcase the effectiveness of our proposed method, which outperforms the baseline FFHNet with an 18.57% higher success rate in real-world evaluation. We further extend DexGanGrasp to DexAfford-Prompt, an open-vocabulary affordance grounding pipeline for dexterous grasping that leverages Multimodal Large Language Models (MLLMs) and Vision Language Models (VLMs) to achieve task-oriented grasping with successful real-world deployments.
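The generate-then-evaluate structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `sample_and_rank`, the flat grasp vector, the latent dimension, and the toy generator and evaluator below are all hypothetical stand-ins, and whether the paper samples and ranks candidates in exactly this way is an assumption. The sketch only shows the general pattern of drawing several candidate grasps from a conditional generator and ordering them by an evaluator's stability score.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical grasp representation: a flat vector (e.g. pose plus joint values).
Grasp = List[float]


def sample_and_rank(
    generator: Callable[[Sequence[float], Sequence[float]], Grasp],
    evaluator: Callable[[Sequence[float], Grasp], float],
    observation: Sequence[float],
    num_samples: int = 32,
    latent_dim: int = 4,
    seed: int = 0,
) -> List[Tuple[float, Grasp]]:
    """Draw candidate grasps from a conditional generator and sort by score."""
    rng = random.Random(seed)
    scored: List[Tuple[float, Grasp]] = []
    for _ in range(num_samples):
        # Latent noise vector, as in a conditional GAN generator input.
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        grasp = generator(z, observation)
        scored.append((evaluator(observation, grasp), grasp))
    # Highest predicted stability first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Toy stand-ins (assumptions for illustration only, not trained networks).
def toy_generator(z: Sequence[float], observation: Sequence[float]) -> Grasp:
    # Perturb the observation encoding with scaled noise.
    return [o + 0.1 * n for o, n in zip(observation, z)]


def toy_evaluator(observation: Sequence[float], grasp: Grasp) -> float:
    # Score in (0, 1]: higher when the grasp stays close to the observation.
    dist = sum((g - o) ** 2 for g, o in zip(grasp, observation))
    return 1.0 / (1.0 + dist)


ranked = sample_and_rank(
    toy_generator, toy_evaluator, [0.0, 0.5, 1.0, 0.2], num_samples=8
)
best_score, best_grasp = ranked[0]
```

In a real system the two callables would be trained networks conditioned on an encoding of the single-view observation; here they are kept as plain functions so that the ranking logic can be read in isolation.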
