Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control

Riashat Islam; Peter Henderson; Maziar Gomrokchi; Doina Precup

arXiv:1708.04133 · cs.LG · August 15, 2017 · 186 cites

TL;DR

This paper examines the reproducibility challenges of benchmarked deep reinforcement learning algorithms for continuous control, emphasizing hyper-parameter tuning, variance, and reporting standards to improve experimental consistency.

Contribution

It provides an analysis of reproducibility issues in policy gradient methods and offers guidelines for better reporting and comparison practices in continuous control tasks.

Findings

01. Hyper-parameters significantly affect policy gradient performance.

02. Variance in algorithms impacts reproducibility of results.

03. Guidelines improve clarity and comparability of experimental outcomes.

Abstract

Policy gradient methods in reinforcement learning have become increasingly prevalent for state-of-the-art performance in continuous control tasks. Novel methods typically benchmark against a few key algorithms such as deep deterministic policy gradients and trust region policy optimization. As such, it is important to present and use consistent baseline experiments. However, this can be difficult due to general variance in the algorithms, hyper-parameter tuning, and environment stochasticity. We investigate and discuss: the significance of hyper-parameters in policy gradients for continuous control, general variance in the algorithms, and reproducibility of reported results. We provide guidelines on reporting novel results as comparisons against baseline methods such that future researchers can make informed decisions when investigating novel methods.
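To make the reporting concern concrete, here is a minimal sketch of summarizing results over several random seeds per hyper-parameter setting instead of reporting a single run. It is not the authors' code or protocol: `simulate_final_return` is a hypothetical stand-in for a real training run, the toy score model and noise level are invented for illustration, and the choice of ten seeds and of standard error as the spread measure are assumptions.

```python
import math
import random
import statistics


def simulate_final_return(seed: int, learning_rate: float) -> float:
    """Hypothetical stand-in for one training run; returns a noisy final score."""
    rng = random.Random(seed)  # a per-run generator keeps each trial reproducible
    # Toy model (an assumption): the score peaks near learning_rate = 1e-3.
    base = 1000.0 - 200.0 * abs(math.log10(learning_rate) + 3.0)
    return base + rng.gauss(0.0, 150.0)


def summarize(returns):
    """Return trial count, mean, sample standard deviation, and standard error."""
    n = len(returns)
    mean = statistics.fmean(returns)
    std = statistics.stdev(returns)  # sample standard deviation (n - 1 denominator)
    return {"n": n, "mean": mean, "std": std, "stderr": std / math.sqrt(n)}


seeds = list(range(10))
report = {
    lr: summarize([simulate_final_return(seed, lr) for seed in seeds])
    for lr in (1e-4, 1e-3, 1e-2)
}

for lr, stats in report.items():
    print(
        f"lr={lr:g}: mean={stats['mean']:.1f} "
        f"+/- {stats['stderr']:.1f} (stderr, n={stats['n']})"
    )
```

Reporting the number of trials and the spread next to the mean lets a reader judge whether a gap between two methods, or between two hyper-parameter settings, exceeds the run-to-run noise.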

Code & Models

Repositories

Breakend/ReproducibilityInContinuousPolicyGradientMethods
tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Fuel Cells and Related Materials · Advanced Memory and Neural Computing