Transferring End-to-End Visuomotor Control from Simulation to Real World for a Multi-Stage Task
Stephen James, Andrew J. Davison, Edward Johns

TL;DR
This paper shows that a simple end-to-end (image to velocity) controller, trained entirely in simulation with domain randomisation, can carry out a multi-stage grasp-and-drop task on a real robot without having seen a single real image.
Contribution
The authors introduce a domain randomization-based approach for transferring simulation-trained control policies to real robots for multi-stage tasks without real-world data.
Findings
Accomplished the multi-stage manipulation task in the real world
Generalized to dynamic lighting and distractors
Effective for long-horizon tasks
Abstract
End-to-end control for robot manipulation and grasping is emerging as an attractive alternative to traditional pipelined approaches. However, end-to-end methods tend to either be slow to train, exhibit little or no generalisability, or lack the ability to accomplish long-horizon or multi-stage tasks. In this paper, we show how two simple techniques can lead to end-to-end (image to velocity) execution of a multi-stage task, which is analogous to a simple tidying routine, without having seen a single real image. This involves locating, reaching for, and grasping a cube, then locating a basket and dropping the cube inside. To achieve this, robot trajectories are computed in a simulator, to collect a series of control velocities which accomplish the task. Then, a CNN is trained to map observed images to velocities, using domain randomisation to enable generalisation to real world images.…
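The data-collection idea in the abstract (simulated trajectories yield control velocities, which are paired with randomised observations) can be illustrated with a minimal sketch. It is not the authors' implementation: `SceneParams`, `sample_scene`, `reach_velocities`, `collect_episode`, and every numeric range below are hypothetical, the straight-line reach stands in for whatever planner the simulator uses, and a dictionary stands in for a rendered image.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """Hypothetical per-episode appearance parameters (not taken from the paper)."""
    cube_rgb: tuple
    basket_rgb: tuple
    light_intensity: float
    camera_jitter: tuple
    num_distractors: int


def sample_scene(rng):
    """Draw one randomised scene; all ranges are illustrative guesses."""
    def rgb():
        return tuple(rng.random() for _ in range(3))

    return SceneParams(
        cube_rgb=rgb(),
        basket_rgb=rgb(),
        light_intensity=rng.uniform(0.3, 1.5),
        camera_jitter=tuple(rng.uniform(-0.02, 0.02) for _ in range(3)),
        num_distractors=rng.randint(0, 3),
    )


def reach_velocities(start, goal, steps):
    """Straight-line stand-in for a planned trajectory: constant velocity
    commands that move `start` to `goal` in `steps` unit time steps."""
    step_velocity = tuple((g - s) / steps for s, g in zip(start, goal))
    return [step_velocity] * steps


def collect_episode(rng, steps=5):
    """Pair each placeholder observation with the velocity commanded at that step."""
    scene = sample_scene(rng)
    cube_pos = (rng.uniform(-0.2, 0.2), rng.uniform(0.3, 0.6), 0.0)
    gripper = (0.0, 0.0, 0.3)
    samples = []
    for velocity in reach_velocities(gripper, cube_pos, steps):
        # A real pipeline would render an image of `scene` here.
        observation = {"scene": scene, "gripper": gripper}
        samples.append((observation, velocity))
        gripper = tuple(p + v for p, v in zip(gripper, velocity))
    return samples, gripper, cube_pos


rng = random.Random(0)
samples, final_gripper, cube_pos = collect_episode(rng)
```

Each `(observation, velocity)` pair would serve as one supervised training example for the image-to-velocity network. Resampling the scene on every episode is what exposes the network to varied appearances.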
Taxonomy
Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Advanced Vision and Imaging
