Randomized Ensembled Double Q-Learning: Learning Fast Without a Model

Xinyue Chen; Che Wang; Zijian Zhou; Keith Ross

arXiv:2101.05982 · cs.LG · March 19, 2021 · 26 cites

TL;DR

REDQ is a simple, model-free reinforcement learning algorithm that achieves high sample efficiency and competitive performance in continuous-action tasks by using a high update-to-data ratio, ensembles, and in-target minimization.

Contribution

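REDQ combines a high UTD ratio, an ensemble of Q functions, and in-target minimization over a random subset of that ensemble. The result is a model-free DRL algorithm for continuous actions whose sample efficiency matches or exceeds that of a state-of-the-art model-based method on the MuJoCo benchmark.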

Findings

01. REDQ matches or exceeds a state-of-the-art model-based algorithm on the MuJoCo benchmark.

02. REDQ uses fewer parameters and less wall-clock run time than that model-based method.

03. First successful application of a high UTD ratio in a model-free DRL algorithm for continuous actions.
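
To make the UTD idea concrete, here is a minimal sketch of a training loop, taking the UTD ratio to mean the number of gradient updates performed per environment interaction. The function name `train_with_utd` and the callbacks `collect_transition` and `gradient_update` are hypothetical placeholders, not part of any REDQ codebase, and the value 20 in the usage line is purely illustrative.

```python
from typing import Callable


def train_with_utd(
    num_env_steps: int,
    utd_ratio: int,
    collect_transition: Callable[[], None],  # hypothetical: one environment interaction
    gradient_update: Callable[[], None],     # hypothetical: one update of the agent
) -> int:
    """Run `utd_ratio` gradient updates after every environment step.

    Returns the total number of gradient updates performed.
    """
    total_updates = 0
    for _ in range(num_env_steps):
        collect_transition()  # gather one new transition (data)
        for _ in range(utd_ratio):
            gradient_update()  # reuse stored data for several updates
            total_updates += 1
    return total_updates


if __name__ == "__main__":
    # Illustrative values only: 1000 environment steps, 20 updates per step.
    updates = train_with_utd(1000, 20, lambda: None, lambda: None)
    print(updates)
```

With a UTD ratio of 1 the loop reduces to the usual one-update-per-step schedule; a ratio much larger than 1 spends more computation per collected transition.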

Abstract

Using a high Update-To-Data (UTD) ratio, model-based methods have recently achieved much higher sample efficiency than previous model-free methods for continuous-action DRL benchmarks. In this paper, we introduce a simple model-free algorithm, Randomized Ensembled Double Q-Learning (REDQ), and show that its performance is just as good as, if not better than, a state-of-the-art model-based algorithm for the MuJoCo benchmark. Moreover, REDQ can achieve this performance using fewer parameters than the model-based method, and with less wall-clock run time. REDQ has three carefully integrated ingredients which allow it to achieve its high performance: (i) a UTD ratio >> 1; (ii) an ensemble of Q functions; (iii) in-target minimization across a random subset of Q functions from the ensemble. Through carefully designed experiments, we provide a detailed analysis of REDQ and related model-free…
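
Ingredient (iii) can be illustrated with a short NumPy sketch of how a shared target might be formed by minimizing over a random subset of the ensemble. This is an assumption-laden illustration, not the authors' implementation: the function name `redq_style_target`, the parameters `subset_size` and `gamma`, and the array layout are all hypothetical, and the sketch leaves out policy sampling and any other terms the full algorithm may include in its target.

```python
import numpy as np


def redq_style_target(
    rewards: np.ndarray,        # shape (B,)
    dones: np.ndarray,          # shape (B,), 1.0 where the episode ended
    next_q_values: np.ndarray,  # shape (N, B): each ensemble member's Q(s', a')
    rng: np.random.Generator,
    subset_size: int = 2,       # assumed subset size, for illustration
    gamma: float = 0.99,        # assumed discount factor
) -> np.ndarray:
    """Bootstrap target using the minimum over a random subset of Q functions."""
    n_critics = next_q_values.shape[0]
    # Draw distinct ensemble indices uniformly at random.
    subset = rng.choice(n_critics, size=subset_size, replace=False)
    # In-target minimization: elementwise minimum across the chosen members.
    min_q = next_q_values[subset].min(axis=0)
    # Standard one-step bootstrap; terminal transitions do not bootstrap.
    return rewards + gamma * (1.0 - dones) * min_q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    next_q = rng.normal(size=(10, 4))  # 10 ensemble members, batch of 4
    target = redq_style_target(np.ones(4), np.zeros(4), next_q, rng)
    print(target.shape)
```

In this sketch every ensemble member would be regressed toward the same target, and a fresh subset is drawn on each call, so no fixed pair of Q functions determines the target.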

Code & Models

Models

guitaa/redq-ml-agents-crawler (Hugging Face model)

Videos

Randomized Ensembled Double Q-Learning: Learning Fast Without a Model · SlidesLive

Taxonomy

Topics: Machine Learning and ELM · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications

Methods: Double Q-learning · Q-Learning