Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance
Toan Nguyen, Minh Nhat Vu, Baoru Huang, An Vuong, Quan Vuong, Ngan Le, Thieu Vo, Anh Nguyen

TL;DR
This paper introduces a language-driven 6-DoF grasp detection method that uses negative prompt guidance, so that robots can follow natural language commands to grasp specific objects in cluttered environments. It also contributes a new large-scale dataset and a diffusion model for the task.
Contribution
The paper presents a novel negative prompt guidance strategy within a diffusion model for language-driven 6-DoF grasp detection, along with Grasp-Anything-6D, a large-scale dataset for this task.
Findings
Outperforms baseline methods in benchmarking experiments.
Effective in real-world robotic grasping scenarios.
Enables natural language command-based object grasping.
Abstract
6-DoF grasp detection has been a fundamental and challenging problem in robotic vision. While previous works have focused on ensuring grasp stability, they often do not consider human intention conveyed through natural language, hindering effective collaboration between robots and users in complex 3D environments. In this paper, we present a new approach for language-driven 6-DoF grasp detection in cluttered point clouds. We first introduce Grasp-Anything-6D, a large-scale dataset for the language-driven 6-DoF grasp detection task with 1M point cloud scenes and more than 200M language-associated 3D grasp poses. We further introduce a novel diffusion model that incorporates a new negative prompt guidance learning strategy. The proposed negative prompt strategy directs the detection process toward the desired object while steering away from unwanted ones given the language input. Our…
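The idea of steering toward the desired object and away from unwanted ones can be illustrated with a minimal sketch. It assumes the negative-prompt heuristic commonly used in image diffusion samplers, where the denoiser is run once with the target prompt and once with a negative prompt and the two noise predictions are extrapolated. This is not the paper's learned strategy, which the abstract does not specify; the function name `negative_prompt_guidance`, the guidance weight `w`, and the toy 2-D vectors are all hypothetical.

```python
import numpy as np

def negative_prompt_guidance(eps_pos, eps_neg, w=2.0):
    """Combine two noise predictions (illustrative assumption, not the paper's formula).

    eps_pos: noise predicted when conditioning on the target prompt
    eps_neg: noise predicted when conditioning on the negative prompt
    w:       guidance weight; w > 1 pushes further from the negative prediction
    """
    eps_pos = np.asarray(eps_pos, dtype=float)
    eps_neg = np.asarray(eps_neg, dtype=float)
    # Start at the negative prediction and extrapolate toward the positive one.
    return eps_neg + w * (eps_pos - eps_neg)

# Toy stand-ins for denoiser outputs at one sampling step.
eps_pos = np.array([1.0, 0.0])
eps_neg = np.array([0.0, 1.0])
guided = negative_prompt_guidance(eps_pos, eps_neg, w=2.0)
print(guided)
```

With `w = 1` the result equals the positive prediction, and larger values move the sample further from what the negative prompt describes.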
Taxonomy
Topics: Software Testing and Debugging Techniques · Multimodal Machine Learning Applications · Network Packet Processing and Optimization
Methods: Diffusion
