Sim2Real Transfer for Vision-Based Grasp Verification
Pau Amargant, Peter Hönig, Markus Vincze

TL;DR
This paper introduces a vision-based system for verifying successful robot grasps, utilizing synthetic data and a two-stage detection and classification architecture to improve accuracy in real-world scenarios.
Contribution
The work presents a novel two-stage vision-based grasp verification method and introduces HSR-GraspSynth, a synthetic dataset for training and testing in diverse grasping scenarios.
Findings
High accuracy achieved in real-world grasp verification
Effective use of synthetic data for training models
Potential for integration into robotic grasping pipelines
Abstract
The verification of successful grasps is a crucial aspect of robot manipulation, particularly when handling deformable objects. Traditional methods relying on force and tactile sensors often struggle with deformable and non-rigid objects. In this work, we present a vision-based approach for grasp verification to determine whether the robotic gripper has successfully grasped an object. Our method employs a two-stage architecture: first, a YOLO-based object detection model detects and locates the robot's gripper; then, a ResNet-based classifier determines whether an object is present. To address the limitations of real-world data capture, we introduce HSR-GraspSynth, a synthetic dataset designed to simulate diverse grasping scenarios. Furthermore, we explore the use of Visual Question Answering capabilities as a zero-shot baseline against which we compare our model. Experimental results…
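The two-stage flow described in the abstract (locate the gripper, then classify the cropped region) can be sketched roughly as follows. This is a minimal illustration of the control flow only, not the authors' implementation: the names `verify_grasp`, `GraspResult`, `toy_detector`, and `toy_classifier`, the 0.5 threshold, and the toy list-of-lists image format are all assumptions made for this example. In a real system the two callables would wrap a YOLO-based detector and a ResNet-based classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]        # toy grayscale image, row-major (assumption)
Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), upper bounds exclusive


@dataclass
class GraspResult:
    gripper_found: bool
    object_present: Optional[bool]  # None when no gripper was detected
    score: Optional[float]          # classifier score for "object present"


def crop(image: Image, box: Box) -> Image:
    """Cut the gripper region out of the full image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def verify_grasp(
    image: Image,
    detect_gripper: Callable[[Image], Optional[Box]],
    classify_crop: Callable[[Image], float],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> GraspResult:
    # Stage 1: locate the gripper; stop early if it is not visible.
    box = detect_gripper(image)
    if box is None:
        return GraspResult(False, None, None)
    # Stage 2: classify only the gripper region.
    score = classify_crop(crop(image, box))
    return GraspResult(True, score >= threshold, score)


# Stand-ins for the two models (hypothetical, for demonstration only).
def toy_detector(image: Image) -> Optional[Box]:
    # Pretend the gripper always occupies the centre 2x2 region.
    return (1, 1, 3, 3)


def toy_classifier(patch: Image) -> float:
    # Mean intensity as a fake "object present" score.
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels)


holding = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
empty = [[0.0] * 4 for _ in range(4)]

print(verify_grasp(holding, toy_detector, toy_classifier))
print(verify_grasp(empty, toy_detector, toy_classifier))
```

Passing the detector and classifier in as callables keeps the sketch independent of any particular model library; the early return covers the case where the gripper is not found, so the classifier never runs on an arbitrary crop.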
Taxonomy
Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Multimodal Machine Learning Applications
