Online Meta-learning by Parallel Algorithm Competition

Stefan Elfwing; Eiji Uchibe; Kenji Doya

arXiv:1702.07490 · cs.LG · February 27, 2017 · 2 cites

Open Access

TL;DR

This paper introduces OMPAC, a parallel algorithm competition approach for online meta-learning in reinforcement learning, which adaptively tunes meta-parameters to improve performance in complex tasks.

Contribution

The paper presents a novel parallel meta-learning method, OMPAC, that dynamically adapts meta-parameters during reinforcement learning, outperforming state-of-the-art results in various games.

Findings

01

Improved state-of-the-art results in stochastic SZ-Tetris and Tetris by 31% and 84%, respectively.

02

Enhanced deep Sarsa(λ) agents in Atari games by over 62%.

03

Demonstrated adaptive meta-parameter tuning during learning.

Abstract

The efficiency of reinforcement learning algorithms depends critically on a few meta-parameters that modulate the learning updates and the trade-off between exploration and exploitation. The adaptation of the meta-parameters is an open question in reinforcement learning, which arguably has become more of an issue recently with the success of deep reinforcement learning in high-dimensional state spaces. The long learning times in domains such as Atari 2600 video games make comprehensive searches for appropriate meta-parameter values infeasible. We propose the Online Meta-learning by Parallel Algorithm Competition (OMPAC) method. In the OMPAC method, several instances of a reinforcement learning algorithm are run in parallel with small differences in the initial values of the meta-parameters. After a fixed number of episodes, the instances are selected based on their…
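
The loop described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation. Because the abstract as quoted here is cut off before stating the selection criterion, the sketch assumes that instances are ranked by their recent score, that the top half survives, and that copies of the survivors receive noisy meta-parameters. The function names (`run_episodes`, `perturb`, `ompac_sketch`), the meta-parameter names (`alpha`, `epsilon`), and the toy scoring function are all hypothetical stand-ins for a real reinforcement learning agent.

```python
import random


def run_episodes(meta, n_episodes, rng):
    """Hypothetical stand-in for training an RL agent for n_episodes.

    Returns a noisy mean score. The toy objective peaks at
    alpha = 0.1 and epsilon = 0.05 (arbitrary values chosen for illustration).
    """
    score = -((meta["alpha"] - 0.1) ** 2) - ((meta["epsilon"] - 0.05) ** 2)
    noise = sum(rng.gauss(0.0, 0.001) for _ in range(n_episodes)) / n_episodes
    return score + noise


def perturb(meta, rng, prob=0.5, scale=0.1):
    """With probability prob, rescale each meta-parameter by Gaussian noise."""
    out = {}
    for name, value in meta.items():
        if rng.random() < prob:
            value = value * (1.0 + rng.gauss(0.0, scale))
        out[name] = min(max(value, 1e-6), 1.0)  # keep values in (0, 1]
    return out


def ompac_sketch(n_instances=8, n_generations=30, episodes_per_gen=20, seed=0):
    rng = random.Random(seed)
    base = {"alpha": 0.5, "epsilon": 0.5}
    # Start the instances with small differences in their meta-parameters.
    population = [perturb(base, rng, prob=1.0) for _ in range(n_instances)]
    history = []  # best measured score in each generation
    for _ in range(n_generations):
        # In a real system each instance would run in parallel.
        scores = [run_episodes(m, episodes_per_gen, rng) for m in population]
        ranked = sorted(zip(scores, range(n_instances)), reverse=True)
        history.append(ranked[0][0])
        # Assumed selection rule: keep the top half unchanged.
        survivors = [population[i] for _, i in ranked[: max(1, n_instances // 2)]]
        # Refill the population with perturbed copies of the survivors.
        n_children = n_instances - len(survivors)
        children = [perturb(survivors[i % len(survivors)], rng)
                    for i in range(n_children)]
        population = survivors + children
    return population, history


if __name__ == "__main__":
    final_population, best_scores = ompac_sketch()
    print("first-generation best:", best_scores[0])
    print("last-generation best:", best_scores[-1])
    print("a surviving setting:", final_population[0])
```

In this toy version an instance is only a dictionary of meta-parameters. An actual agent would also carry learned weights, which would presumably be copied along with the meta-parameters when an instance is replaced.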

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Data Stream Mining Techniques