Continuous Control With Ensemble Deep Deterministic Policy Gradients

Piotr Januszewski; Mateusz Olko; Michał Królikowski; Jakub Świątkowski; Marcin Andrychowicz; Łukasz Kuciński; Piotr Miłoś

arXiv:2111.15382·cs.LG·December 1, 2021·1 cites

Open Access 1 Repo

TL;DR

This paper empirically studies how individual components of deep reinforcement learning interact in continuous control. The resulting insights lead to ED2, a method reported to achieve state-of-the-art results while remaining simple to use in practice.

Contribution

The paper introduces ED2 (Ensemble Deep Deterministic Policy Gradients), an ensemble-based method for continuous control in deep RL that combines the insights from the paper's empirical study.

Findings

01 Averaging multiple actors trained on the same data improves performance.

02 Existing methods are unstable across training runs, training epochs, and evaluation runs.

03 Posterior-sampling exploration outperforms approximated UCB combined with the weighted Bellman backup.
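
The first and third findings can be illustrated with a small NumPy sketch. It is not taken from the paper or its repository: the linear `tanh` actors, the helper names, and the ensemble size are all hypothetical, and choosing one ensemble member per episode is only one common way to approximate posterior sampling with an ensemble, assumed here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_linear_actor(obs_dim, act_dim, rng):
    """Hypothetical stand-in for a trained actor: a fixed linear map squashed by tanh."""
    weights = rng.normal(scale=0.1, size=(obs_dim, act_dim))
    return lambda obs: np.tanh(obs @ weights)

def ensemble_mean_action(actors, obs):
    """Evaluation-time action: average the actions proposed by all actors."""
    return np.mean([actor(obs) for actor in actors], axis=0)

def sample_actor_for_episode(actors, rng):
    """Exploration (assumed scheme): pick one member uniformly and follow it for a whole episode."""
    return actors[rng.integers(len(actors))]

# Illustrative sizes only; none of these values come from the paper.
obs_dim, act_dim, ensemble_size = 4, 2, 5
actors = [make_linear_actor(obs_dim, act_dim, rng) for _ in range(ensemble_size)]
obs = rng.normal(size=obs_dim)

eval_action = ensemble_mean_action(actors, obs)         # used when evaluating
behaviour_actor = sample_actor_for_episode(actors, rng)  # used when collecting data
explore_action = behaviour_actor(obs)
```

In this sketch no additive action noise is applied: any exploration comes from disagreement between ensemble members, which is in the spirit of the abstract's remark that additive action noise is not required for effective training.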

Abstract

The growth of deep reinforcement learning (RL) has brought multiple exciting tools and methods to the field. This rapid expansion makes it important to understand the interplay between individual elements of the RL toolbox. We approach this task from an empirical perspective by conducting a study in the continuous control setting. We present multiple insights of fundamental nature, including: an average of multiple actors trained from the same data boosts performance; the existing methods are unstable across training runs, epochs of training, and evaluation runs; a commonly used additive action noise is not required for effective training; a strategy based on posterior sampling explores better than the approximated UCB combined with the weighted Bellman backup; the weighted Bellman backup alone cannot replace the clipped double Q-Learning; the critics' initialization plays the major…
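
For readers unfamiliar with clipped double Q-learning, which the abstract names, the following minimal sketch shows the standard form of its target computation: the bootstrap term uses the element-wise minimum of two target-critic estimates. The function name, the discount factor, and the toy batch are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def clipped_double_q_target(reward, done, next_q1, next_q2, gamma=0.99):
    """TD target that bootstraps from the smaller of two target-critic estimates."""
    min_next_q = np.minimum(next_q1, next_q2)
    # Terminal transitions (done == 1) do not bootstrap.
    return reward + gamma * (1.0 - done) * min_next_q

# Toy batch of three transitions (hypothetical numbers).
reward = np.array([1.0, 0.0, 2.0])
done = np.array([0.0, 0.0, 1.0])
next_q1 = np.array([10.0, 5.0, 7.0])
next_q2 = np.array([8.0, 6.0, 9.0])

targets = clipped_double_q_target(reward, done, next_q1, next_q2, gamma=0.9)
print(targets)  # approximately [8.2, 4.5, 2.0]
```

Taking the minimum counteracts the overestimation bias of a single critic. According to the abstract, the weighted Bellman backup alone cannot replace this mechanism.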

Code & Models

Repositories

ed2-paper/ed2
tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks · Adversarial Robustness in Machine Learning