Grasp-Anything: Large-scale Grasp Dataset from Foundation Models
An Dinh Vuong, Minh Nhat Vu, Hieu Le, Baoru Huang, Binh Huynh, Thieu Vo, Andreas Kugi, Anh Nguyen

TL;DR
This paper introduces Grasp-Anything, a large-scale, diverse grasp dataset synthesized from foundation models. By drawing on the real-world knowledge embedded in those models, the dataset improves zero-shot grasp detection and supports robotic grasping applications.
Contribution
The paper presents a new large-scale grasp dataset generated from foundation models. It offers greater object diversity than prior grasp datasets and enables zero-shot grasp detection in robotics.
Findings
Successfully facilitates zero-shot grasp detection
Enables real-world robotic grasping experiments
Surpasses prior datasets in size and diversity
Abstract
Foundation models such as ChatGPT have made significant strides in robotic tasks due to their universal representation of real-world domains. In this paper, we leverage foundation models to tackle grasp detection, a persistent challenge in robotics with broad industrial applications. Despite numerous grasp datasets, their object diversity remains limited compared to real-world figures. Fortunately, foundation models possess an extensive repository of real-world knowledge, including objects we encounter in our daily lives. As a consequence, a promising solution to the limited representation in previous grasp datasets is to harness the universal knowledge embedded in these foundation models. We present Grasp-Anything, a new large-scale grasp dataset synthesized from foundation models to implement this solution. Grasp-Anything excels in diversity and magnitude, boasting 1M samples with…
Taxonomy
Topics: Robot Manipulation and Learning · Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications
