Self-Paced Deep Reinforcement Learning

Pascal Klink; Carlo D'Eramo; Jan Peters; Joni Pajarinen

arXiv:2004.11812 · cs.LG · October 26, 2020 · 1 citation

TL;DR

This paper frames automatic curriculum generation for reinforcement learning as an inference problem; the resulting curricula improve learning performance across several environments and deep RL algorithms.

Contribution

The paper proposes an inference-based approach to automatic curriculum generation whose pace is controlled by the agent, with solid theoretical motivation and easy integration with deep RL algorithms.

Findings

1. The generated curricula improve learning speed and stability.
2. The method matches or outperforms existing CRL algorithms in multiple environments.
3. The approach integrates easily with deep RL algorithms.

Abstract

Curriculum reinforcement learning (CRL) improves the learning speed and stability of an agent by exposing it to a tailored series of tasks throughout learning. Despite empirical successes, an open question in CRL is how to automatically generate a curriculum for a given reinforcement learning (RL) agent, avoiding manual design. In this paper, we propose an answer by interpreting the curriculum generation as an inference problem, where distributions over tasks are progressively learned to approach the target task. This approach leads to an automatic curriculum generation, whose pace is controlled by the agent, with solid theoretical motivation and easily integrated with deep RL algorithms. In the conducted experiments, the curricula generated with the proposed algorithm significantly improve learning performance across several environments and deep RL algorithms, matching or…

Code & Models

Repositories

psclklnk/spdl (Official)

Videos

Self-Paced Deep Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Machine Learning and Data Classification

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings