DexVIP: Learning Dexterous Grasping with Human Hand Pose Priors from Video

Priyanka Mandikal; Kristen Grauman

arXiv:2202.00164 · cs.RO · February 2, 2022 · 5 cites

Open Access

TL;DR

DexVIP leverages in-the-wild videos to learn dexterous robotic grasping by incorporating human hand pose priors, enabling scalable, efficient, and demonstration-free training for complex robotic hands.

Contribution

We introduce DexVIP, a novel method that uses human hand pose priors from YouTube videos to train dexterous grasping policies via deep reinforcement learning.

Findings

01 Outperforms existing methods that lack hand pose priors

02 Trains faster than tele-operation-based approaches

03 Successfully generalizes to 27 different objects

Abstract

Dexterous multi-fingered robotic hands have a formidable action space, yet their morphological similarity to the human hand holds immense potential to accelerate robot learning. We propose DexVIP, an approach to learn dexterous robotic grasping from human-object interactions present in in-the-wild YouTube videos. We do this by curating grasp images from human-object interaction videos and imposing a prior over the agent's hand pose when learning to grasp with deep reinforcement learning. A key advantage of our method is that the learned policy is able to leverage free-form in-the-wild visual data. As a result, it can easily scale to new objects, and it sidesteps the standard practice of collecting human demonstrations in a lab -- a much more expensive and indirect way to capture human expertise. Through experiments on 27 objects with a 30-DoF simulated robot hand, we demonstrate that…
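The idea of imposing a prior over the agent's hand pose during reinforcement learning can be illustrated with a simple shaped reward. The sketch below is not the paper's formulation: the function names, the mean-squared-error penalty, and the `prior_weight` value are all assumptions made for illustration. It only shows one plausible way a task reward could be combined with a term that rewards staying close to a target hand pose inferred from human grasp images.

```python
from typing import Sequence


def pose_prior_penalty(agent_joints: Sequence[float],
                       prior_joints: Sequence[float]) -> float:
    """Mean squared difference between two joint-angle vectors (radians).

    Hypothetical helper: the paper's actual pose distance is not given
    in the abstract, so MSE is an assumption.
    """
    if len(agent_joints) != len(prior_joints):
        raise ValueError("joint vectors must have the same length")
    if not agent_joints:
        raise ValueError("joint vectors must be non-empty")
    squared = [(a - p) ** 2 for a, p in zip(agent_joints, prior_joints)]
    return sum(squared) / len(squared)


def shaped_reward(task_reward: float,
                  agent_joints: Sequence[float],
                  prior_joints: Sequence[float],
                  prior_weight: float = 0.5) -> float:
    """Task reward minus a weighted penalty for deviating from the prior pose.

    `prior_weight` is an assumed hyperparameter, not a value from the paper.
    """
    penalty = pose_prior_penalty(agent_joints, prior_joints)
    return task_reward - prior_weight * penalty


if __name__ == "__main__":
    # 30 values chosen only to echo the 30-DoF hand mentioned in the abstract.
    prior = [0.2] * 30   # stand-in for a pose derived from a human grasp image
    agent = [0.3] * 30   # stand-in for the agent's current joint angles
    print(shaped_reward(1.0, agent, prior))
```

With this kind of shaping, an agent whose hand pose matches the prior keeps its full task reward, while deviations reduce it in proportion to `prior_weight`.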

Taxonomy

Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Robot Manipulation and Learning