Web2Grasp: Learning Functional Grasps from Web Images of Hand-Object Interactions

Hongyi Chen; Yunchao Yao; Yufei Ye; Zhixuan Xu; Homanga Bharadhwaj; Jiashun Wang; Shubham Tulsiani; Zackory Erickson; Jeffrey Ichnowski

arXiv:2505.05517 · cs.CV · May 14, 2025

TL;DR

Web2Grasp trains a functional grasping model for multi-finger robot hands from web images of human hand-object interactions. It reports a 75.8% success rate on seen objects in simulation (83.4% with simulator-augmented data), generalization to unseen objects, and an 85% success rate in sim-to-real transfer on the LEAP Hand.

Contribution

The paper introduces an approach for learning functional grasps from web images: it reconstructs 3D human hand-object interactions from RGB images and retargets the human hand to multi-finger robot hands, avoiding the need for costly teleoperated robot demonstrations.

Findings

01. Achieved a 75.8% success rate on seen objects in simulation.

02. Improved the success rate to 83.4% with simulator-augmented data.

03. Attained an 85% success rate in sim-to-real transfer on the LEAP Hand.

Abstract

Functional grasp is essential for enabling dexterous multi-finger robot hands to manipulate objects effectively. However, most prior work either focuses on power grasping, which simply involves holding an object still, or relies on costly teleoperated robot demonstrations to teach robots how to grasp each object functionally. Instead, we propose extracting human grasp information from web images since they depict natural and functional object interactions, thereby bypassing the need for curated demonstrations. We reconstruct human hand-object interaction (HOI) 3D meshes from RGB images, retarget the human hand to multi-finger robot hands, and align the noisy object mesh with its accurate 3D shape. We show that these relatively low-quality HOI data from inexpensive web sources can effectively train a functional grasping model. To further expand the grasp dataset for seen and unseen…
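
To make the mesh-alignment step concrete, here is a minimal sketch of one standard way to rigidly align a noisy point set to a reference shape: the Kabsch (orthogonal Procrustes) fit. This is an illustration under stated assumptions, not the paper's method. The abstract does not say which alignment algorithm is used. The function name `kabsch_align` is hypothetical. The sketch assumes point-to-point correspondences are already known and ignores scale.

```python
import numpy as np

def kabsch_align(source, target):
    """Rigidly align `source` to `target` (both N x 3, rows in correspondence).

    Returns a rotation R (3 x 3) and translation t (3,) that minimize
    sum_i || R @ source[i] + t - target[i] ||^2.
    Assumption: correspondences are known and there is no scale difference.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)

    # Center both point sets on their centroids.
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    src_centered = source - src_centroid
    tgt_centered = target - tgt_centroid

    # Cross-covariance matrix and its SVD.
    H = src_centered.T @ tgt_centered
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation that maps the rotated source centroid onto the target centroid.
    t = tgt_centroid - R @ src_centroid
    return R, t
```

In practice, a noisy reconstructed mesh rarely comes with known correspondences, so a fit like this would typically sit inside an iterative scheme (for example, ICP) or be replaced by a learned or optimization-based alignment.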
