Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup

Devin Schwab; Tobias Springenberg; Murilo F. Martins; Thomas Lampe; Michael Neunert; Abbas Abdolmaleki; Tim Hertweck; Roland Hafner; Francesco Nori; Martin Riedmiller

arXiv:1902.04706 · cs.LG · February 20, 2019 · 6 cites

Open Access

TL;DR

This paper introduces a multi-task reinforcement learning approach that accelerates the training of vision-based control policies on real robots. It is demonstrated on a Ball-in-a-Cup task, with significant learning speed-ups over standard SAC-X in simulation.

Contribution

The paper extends the SAC-X framework by incorporating auxiliary tasks with task features available only during training, enabling fast learning of vision-based policies from scratch.

Findings

01. In simulation, learning is significantly faster than with standard SAC-X.

02. Successful real-world learning of the Ball-in-a-Cup task without transfer or imitation.

03. Auxiliary tasks that use features available only at training time are effective for training the vision-based policy.

Abstract

We present a method for fast training of vision-based control policies on real robots. The key idea behind our method is to perform multi-task Reinforcement Learning with auxiliary tasks that differ not only in the reward to be optimized but also in the state-space in which they operate. In particular, we allow auxiliary task policies to utilize task features that are available only at training time. This allows for fast learning of auxiliary policies, which subsequently generate good data for training the main, vision-based control policies. This method can be seen as an extension of the Scheduled Auxiliary Control (SAC-X) framework. We demonstrate the efficacy of our method by using both a simulated and real-world Ball-in-a-Cup game controlled by a robot arm. In simulation, our approach leads to significant learning speed-ups when compared to standard SAC-X. On the real robot we show…
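The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the task names, the toy one-dimensional "ball and cup" state, the fake image, the random policies, and the uniform task scheduler are all assumptions made for illustration, and no learning update is performed. The sketch only shows how policies that see different observations (training-only features versus images) can fill one shared buffer, from which the vision-based task later samples off-policy data.

```python
import random
from dataclasses import dataclass


@dataclass
class Transition:
    features: tuple   # hypothetical training-only state: (ball_height, cup_height)
    image: tuple      # stand-in for camera pixels
    action: float
    rewards: dict     # reward for every task, computed when the data is collected


# Hypothetical tasks: name -> observation type consumed by that task's policy.
TASKS = {
    "aux_ball_above_cup": "features",
    "main_catch_features": "features",
    "main_catch_vision": "image",
}


def observe(state):
    """Return both views (features, fake image) of the same toy state."""
    ball, cup = state
    features = (ball, cup)
    image = tuple(round(x, 1) for x in (ball, cup, ball - cup))
    return features, image


def all_rewards(state):
    """Evaluate every task's reward on one state so all tasks can reuse the data."""
    ball, cup = state
    caught = float(abs(ball - cup) < 0.05)
    return {
        "aux_ball_above_cup": float(ball > cup),
        "main_catch_features": caught,
        "main_catch_vision": caught,
    }


def collect_episode(task, policies, buffer, steps=5, rng=random):
    """Act with the scheduled task's policy; store both observation types."""
    state = (rng.random(), rng.random())
    for _ in range(steps):
        features, image = observe(state)
        obs = features if TASKS[task] == "features" else image
        action = policies[task](obs)
        state = (min(1.0, max(0.0, state[0] + action)), state[1])
        buffer.append(Transition(features, image, action, all_rewards(state)))


def batch_for(task, buffer, size, rng=random):
    """Sample off-policy data for `task`, exposing only its observation type."""
    key = TASKS[task]
    sample = rng.sample(buffer, min(size, len(buffer)))
    return [(getattr(t, key), t.action, t.rewards[task]) for t in sample]


rng = random.Random(0)
# Random placeholder policies; a real system would use learned networks.
policies = {name: (lambda obs, r=rng: r.uniform(-0.1, 0.1)) for name in TASKS}
buffer = []
for _ in range(10):
    task = rng.choice(list(TASKS))  # uniform scheduler, an assumption
    collect_episode(task, policies, buffer, rng=rng)

# The vision task trains on data that feature-based policies helped collect.
vision_batch = batch_for("main_catch_vision", buffer, 8, rng=rng)
```

In this sketch the vision task never reads the `features` field, so the privileged features are needed only while collecting and training, which mirrors the idea that they are unavailable at test time.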

Taxonomy

Topics: Advanced Vision and Imaging · Video Surveillance and Tracking Methods