Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection
Sergey Levine, Peter Pastor, Alex Krizhevsky, Deirdre Quillen

TL;DR
This paper presents a deep learning approach for robotic hand-eye coordination in grasping, trained on a large-scale dataset, enabling real-time, successful grasping of novel objects without camera calibration.
Contribution
Introduces a large convolutional neural network, trained on over 800,000 real grasp attempts, that learns hand-eye coordination for grasping from monocular images, independently of camera calibration or the current robot pose.
Findings
Servos the gripper in real time using the network's grasp-success predictions.
Successfully generalizes to novel objects.
Corrects mistakes through continuous servoing.
Abstract
We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural network to predict the probability that task-space motion of the gripper will result in successful grasps, using only monocular camera images and independently of camera calibration or the current robot pose. This requires the network to observe the spatial relationship between the gripper and objects in the scene, thus learning hand-eye coordination. We then use this network to servo the gripper in real time to achieve successful grasps. To train our network, we collected over 800,000 grasp attempts over the course of two months, using between 6 and 14 robotic manipulators at any given time, with differences in camera placement and hardware. Our experimental evaluation demonstrates that our method…
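The servoing idea in the abstract can be illustrated with a minimal sketch: at each step, score candidate task-space motions with a success predictor, execute the best one, then re-observe. Everything here is an assumption for illustration, not the authors' implementation. `toy_success_probability` is a hypothetical stand-in for the trained network, the "image" is reduced to a 2D gripper-to-object offset, and plain random sampling stands in for whatever optimizer the paper uses to pick motions.

```python
import random


def toy_success_probability(offset, motion):
    """Hypothetical stand-in for the trained network (not the paper's model).

    `offset` plays the role of the camera image: the (x, y) displacement from
    gripper to object, which a real network would have to infer from pixels.
    """
    dx, dy = offset
    mx, my = motion
    # Squared distance that would remain after executing the motion.
    remaining = (dx - mx) ** 2 + (dy - my) ** 2
    # Score decays as the gripper ends up farther from the object.
    return 1.0 / (1.0 + remaining / 0.01)


def choose_motion(offset, predictor, rng, num_samples=64, max_step=0.05):
    """Sample candidate motions and return the one with the highest score."""
    candidates = [
        (rng.uniform(-max_step, max_step), rng.uniform(-max_step, max_step))
        for _ in range(num_samples)
    ]
    return max(candidates, key=lambda m: predictor(offset, m))


def servo(offset, predictor, rng, max_steps=50, close_threshold=0.9):
    """Closed-loop servoing: observe, pick the best motion, move, repeat."""
    for step in range(max_steps):
        # Close the gripper once staying in place is predicted to succeed.
        if predictor(offset, (0.0, 0.0)) >= close_threshold:
            return offset, step
        mx, my = choose_motion(offset, predictor, rng)
        # Moving the gripper by (mx, my) shrinks the offset by that amount.
        offset = (offset[0] - mx, offset[1] - my)
    return offset, max_steps


if __name__ == "__main__":
    final_offset, steps_taken = servo(
        (0.3, -0.2), toy_success_probability, random.Random(0)
    )
    print("steps:", steps_taken, "final offset:", final_offset)
```

Because the loop re-scores motions from a fresh observation at every step, an imperfect motion is corrected on the next iteration rather than compounding, which is the sense in which continuous servoing corrects mistakes.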
