What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

Marcin Andrychowicz; Anton Raichuk; Piotr Stańczyk; Manu Orsini; Sertan Girgin; Raphael Marinier; Léonard Hussenot; Matthieu Geist; Olivier Pietquin; Marcin Michalski; Sylvain Gelly; Olivier Bachem

arXiv:2006.05990·cs.LG·June 11, 2020·104 cites

TL;DR

This paper presents a large-scale empirical study of on-policy reinforcement learning, analyzing more than 50 design choices across over 250,000 trained agents in five continuous control environments to understand their impact and provide practical recommendations.

Contribution

It introduces a unified framework to systematically evaluate the effect of various design decisions in on-policy RL, addressing gaps in understanding and reproducibility.

Findings

01. Certain design choices significantly affect agent performance.

02. Practical guidelines for on-policy RL training are proposed.

03. Insights help align implementations with theoretical algorithms.

Abstract

In recent years, on-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents. Those choices are usually not extensively discussed in the literature, leading to discrepancy between published descriptions of algorithms and their implementations. This makes it hard to attribute progress in RL and slows down overall progress [Engstrom'20]. As a step towards filling that gap, we implement >50 such "choices" in a unified on-policy RL framework, allowing us to investigate their impact in a large-scale empirical study. We train over 250,000 agents in five continuous control environments of different complexity and provide insights and…

Code & Models

Repositories

Videos

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study (Paper Explained) · YouTube

Taxonomy

Topics: Reinforcement Learning in Robotics · Software Engineering Research · Mobile Crowdsensing and Crowdsourcing