Learning by Playing - Solving Sparse Reward Tasks from Scratch

Martin Riedmiller; Roland Hafner; Thomas Lampe; Michael Neunert; Jonas Degrave; Tom Van de Wiele; Volodymyr Mnih; Nicolas Heess; Jost Tobias Springenberg

arXiv:1802.10567 · cs.LG · March 1, 2018 · 154 cites

TL;DR

This paper introduces SAC-X, a reinforcement learning framework that uses auxiliary tasks and learned scheduling to efficiently learn complex behaviors from scratch in environments with sparse rewards.

Contribution

The paper presents SAC-X, a novel approach that combines auxiliary tasks with learned scheduling to improve exploration and learning in sparse reward settings.

Findings

01. SAC-X outperforms baseline methods in robotic manipulation tasks.

02. Active scheduling of auxiliary policies enhances exploration efficiency.

03. The approach enables learning complex behaviors from scratch.

Abstract

We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors - from scratch - in the presence of multiple sparse reward signals. To this end, the agent is equipped with a set of general auxiliary tasks, that it attempts to learn simultaneously via off-policy RL. The key idea behind our method is that active (learned) scheduling and execution of auxiliary policies allows the agent to efficiently explore its environment - enabling it to excel at sparse reward RL. Our experiments in several challenging robotic manipulation settings demonstrate the power of our approach.
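
The scheduling idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a simple softmax (Boltzmann) scheduler that tracks, for each task, a running mean of the main-task return observed after executing that task's policy, and samples the next task accordingly. The task names, success probabilities, temperature, and the `toy_segment` stand-in for a real environment rollout are all hypothetical.

```python
import math
import random
from collections import defaultdict

# Hypothetical task names; the abstract does not list the actual auxiliary tasks.
TASKS = ["main", "aux_touch", "aux_move"]


class SoftmaxScheduler:
    """Chooses which task's policy to execute next, favouring tasks whose
    execution has so far been followed by higher main-task return."""

    def __init__(self, tasks, temperature=1.0, seed=0):
        self.tasks = list(tasks)
        self.temperature = temperature
        self.value = defaultdict(float)  # running mean of main-task return per task
        self.count = defaultdict(int)    # number of segments executed per task
        self.rng = random.Random(seed)

    def probabilities(self):
        # Boltzmann distribution over the running return estimates.
        logits = [self.value[t] / self.temperature for t in self.tasks]
        max_logit = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - max_logit) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def choose(self):
        return self.rng.choices(self.tasks, weights=self.probabilities(), k=1)[0]

    def update(self, task, main_return):
        # Incremental mean update of the estimate for the executed task.
        self.count[task] += 1
        self.value[task] += (main_return - self.value[task]) / self.count[task]


def toy_segment(task, rng):
    """Stand-in for running one policy for a fixed number of steps.
    Returns a sparse (0 or 1) main-task return; the probabilities are made up."""
    success_prob = {"main": 0.05, "aux_touch": 0.30, "aux_move": 0.10}[task]
    return 1.0 if rng.random() < success_prob else 0.0


def run(num_segments=2000, seed=0):
    rng = random.Random(seed)
    scheduler = SoftmaxScheduler(TASKS, temperature=0.1, seed=seed)
    # Shared buffer: in an off-policy setup, every task's learner could
    # train on all stored experience, whichever policy generated it.
    replay = []
    for _ in range(num_segments):
        task = scheduler.choose()
        main_return = toy_segment(task, rng)
        replay.append((task, main_return))
        scheduler.update(task, main_return)
    return scheduler, replay
```

Calling `run()` returns the scheduler and the shared buffer; inspecting `scheduler.probabilities()` afterwards shows how the sampling distribution has shifted toward tasks that, in this made-up setup, more often precede a main-task reward. A real agent would replace `toy_segment` with environment interaction and train one policy per task from the shared buffer.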

Code & Models

Videos

DeepMind's AI Learns Complex Behaviors From Scratch | Two Minute Papers #239 · YouTube

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Adaptive Dynamic Programming Control