Sim2Real Transfer for Vision-Based Grasp Verification

Pau Amargant; Peter Hönig; Markus Vincze

arXiv:2505.03046·cs.RO·May 7, 2025

TL;DR

This paper introduces a vision-based system that verifies whether a robot grasp succeeded. It combines a two-stage detection-and-classification architecture with synthetic training data to improve accuracy in real-world scenarios.

Contribution

The work presents a novel two-stage vision-based grasp verification method and introduces HSR-GraspSynth, a synthetic dataset for training and testing in diverse grasping scenarios.

Findings

01 High accuracy achieved in real-world grasp verification
02 Effective use of synthetic data for training models
03 Potential for integration into robotic grasping pipelines

Abstract

The verification of successful grasps is a crucial aspect of robot manipulation, particularly when handling deformable objects. Traditional methods relying on force and tactile sensors often struggle with deformable and non-rigid objects. In this work, we present a vision-based approach for grasp verification to determine whether the robotic gripper has successfully grasped an object. Our method employs a two-stage architecture: first, a YOLO-based object detection model detects and locates the robot's gripper, and then a ResNet-based classifier determines the presence of an object. To address the limitations of real-world data capture, we introduce HSR-GraspSynth, a synthetic dataset designed to simulate diverse grasping scenarios. Furthermore, we explore the use of Visual Question Answering capabilities as a zero-shot baseline against which we compare our model. Experimental results…
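The two-stage flow described in the abstract (locate the gripper, then classify the cropped region) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `verify_grasp`, `crop`, `GraspVerdict`, `toy_detector`, `toy_classifier`, and the 0.5 threshold are all hypothetical names and values chosen for this example. The two toy functions stand in for the YOLO-based detector and the ResNet-based classifier, which would be real trained models in practice.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]        # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive


@dataclass
class GraspVerdict:
    gripper_found: bool
    object_grasped: Optional[bool]  # None when no gripper was detected
    score: Optional[float]          # classifier output, None without a gripper


def crop(image: Image, box: Box) -> Image:
    """Cut the boxed region out of the image, clamping the box to the image bounds."""
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    return [row[x0:x1] for row in image[y0:y1]]


def verify_grasp(
    image: Image,
    detect_gripper: Callable[[Image], Optional[Box]],
    classify_crop: Callable[[Image], float],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> GraspVerdict:
    """Stage 1: find the gripper. Stage 2: classify the cropped gripper region."""
    box = detect_gripper(image)
    if box is None:
        # Without a gripper detection there is nothing to classify.
        return GraspVerdict(False, None, None)
    score = classify_crop(crop(image, box))
    return GraspVerdict(True, score >= threshold, score)


def toy_detector(image: Image) -> Optional[Box]:
    """Hypothetical stand-in for the detector: always reports a fixed 2x2 box."""
    return (1, 1, 3, 3)


def toy_classifier(patch: Image) -> float:
    """Hypothetical stand-in for the classifier: mean brightness as a fake score."""
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels) if pixels else 0.0


if __name__ == "__main__":
    holding = [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 1.0, 0.0],
        [0.0, 1.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]
    print(verify_grasp(holding, toy_detector, toy_classifier))
```

Passing the detector and classifier in as callables keeps the sketch independent of any particular model library, so the same control flow would apply if the stand-ins were swapped for trained models.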

Code & Models

Repositories

pauamargant/hsr-graspsynth (Official)

Taxonomy

Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Multimodal Machine Learning Applications