Benchmarking Deep Reinforcement Learning for Continuous Control
Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel

TL;DR
This paper introduces a comprehensive benchmark suite for continuous control tasks in deep reinforcement learning, enabling standardized evaluation and comparison of algorithms across diverse challenging scenarios.
Contribution
It provides a new benchmark suite with diverse tasks and reference implementations, addressing the lack of standardized evaluation in continuous control RL research.
Findings
Systematic evaluation of multiple RL algorithms on the benchmark.
Identification of strengths and weaknesses of different algorithms.
Facilitation of reproducibility and future research in continuous control.
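The systematic evaluation named in the findings can be illustrated with a small harness that runs every method on every task under several random seeds and averages the returns. This is only a minimal sketch under stated assumptions, not the paper's benchmark or its reference implementations: `PointMassTask`, `random_policy`, `proportional_policy`, and `benchmark` are hypothetical names, the task is a toy 1-D problem, and the two "methods" are fixed controllers rather than learning algorithms.

```python
import random
from statistics import mean


class PointMassTask:
    """Hypothetical toy task: push a 1-D point toward the origin."""

    def __init__(self, horizon=50):
        self.horizon = horizon

    def reset(self, rng):
        self.t = 0
        self.x = rng.uniform(-1.0, 1.0)
        return self.x

    def step(self, action):
        action = max(-1.0, min(1.0, action))  # clip to the action bounds
        self.x += 0.1 * action
        self.t += 1
        reward = -abs(self.x)  # closer to the origin is better
        done = self.t >= self.horizon
        return self.x, reward, done


def random_policy(obs, rng):
    """Baseline: uniform random continuous action."""
    return rng.uniform(-1.0, 1.0)


def proportional_policy(obs, rng):
    """Hand-written controller that pushes against the current position."""
    return -5.0 * obs


def rollout_return(task, policy, rng):
    """Run one episode and return the undiscounted sum of rewards."""
    obs = task.reset(rng)
    total, done = 0.0, False
    while not done:
        obs, reward, done = task.step(policy(obs, rng))
        total += reward
    return total


def benchmark(tasks, policies, seeds, episodes=20):
    """Average return for every (task, policy) pair across seeds."""
    results = {}
    for task_name, make_task in tasks.items():
        for policy_name, policy in policies.items():
            per_seed = []
            for seed in seeds:
                rng = random.Random(seed)  # fixed seed for reproducibility
                task = make_task()
                returns = [rollout_return(task, policy, rng) for _ in range(episodes)]
                per_seed.append(mean(returns))
            results[(task_name, policy_name)] = mean(per_seed)
    return results


results = benchmark(
    tasks={"point_mass": PointMassTask},
    policies={"random": random_policy, "proportional": proportional_policy},
    seeds=[0, 1, 2],
)
for (task_name, policy_name), avg in sorted(results.items()):
    print(f"{task_name:12s} {policy_name:14s} {avg:8.2f}")
```

Fixing the seed list and reporting one number per (task, method) pair is what makes results comparable across methods; a real study would replace the fixed controllers with trained policies and the toy task with the benchmark's tasks.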
Abstract
Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs. However, it has been difficult to quantify progress in the domain of continuous control due to the lack of a commonly adopted benchmark. In this work, we present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure. We report novel findings based on the systematic evaluation of a range of implemented reinforcement learning algorithms. Both the benchmark and reference…
Taxonomy
Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Adversarial Robustness in Machine Learning
