Unbiased Deep Reinforcement Learning: A General Training Framework for Existing and Future Algorithms

Huihui Zhang; Wu Huang

arXiv:2005.07782 · cs.LG · May 19, 2020

TL;DR

This paper introduces a new, unbiased training framework for deep reinforcement learning that improves sample efficiency and convergence, applicable to both existing and future algorithms across discrete and continuous tasks.

Contribution

The authors propose a general, unbiased training framework using Monte Carlo sampling and batch updates, enhancing efficiency and convergence in deep reinforcement learning.

Findings

01. Outperforms existing methods in sample efficiency and convergence rate.

02. Applicable to both discrete and continuous control problems.

03. Enables generalization of algorithms within the new framework.

Abstract

In recent years, deep neural networks have been successfully applied to reinforcement learning \cite{bengio2009learning,krizhevsky2012imagenet,hinton2006reducing}. Deep reinforcement learning \cite{mnih2015human} is reported to have an advantage over traditional agents: it learns effective policies directly from high-dimensional sensory inputs. However, within the scope of the literature, there has been no fundamental change to or improvement on the existing training framework. Here we propose a novel training framework that is conceptually comprehensible and potentially easy to generalize to all feasible reinforcement learning algorithms. We employ Monte Carlo sampling to obtain raw data inputs, train on them in batches to obtain Markov decision process sequences, and update the network parameters synchronously instead of using experience replay. This training framework proves…
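
The abstract is truncated here, so the authors' exact procedure is not available on this page. As a rough illustration only of the idea it describes (Monte Carlo sampling of whole episodes, then one synchronous batch update with no replay buffer), the following is a minimal sketch under stated assumptions: the toy chain environment, the tabular value estimate standing in for a network, and the names rollout, batch_update, and train are all hypothetical and do not come from the paper.

```python
import random


def discounted_returns(rewards, gamma=0.99):
    """Monte Carlo returns G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def rollout(num_states, max_steps, rng):
    """Hypothetical toy chain (an assumption, not the paper's task).

    Start in state 0; each step moves right with probability 0.5,
    otherwise stays. Reaching the last state gives reward 1 and ends
    the episode. Returns the visited states and the per-step rewards.
    """
    state, states, rewards = 0, [], []
    for _ in range(max_steps):
        states.append(state)
        if rng.random() < 0.5:
            state += 1
        done = state == num_states - 1
        rewards.append(1.0 if done else 0.0)
        if done:
            break
    return states, rewards


def batch_update(values, episodes, gamma, lr):
    """Apply one synchronous update from a batch of freshly sampled episodes.

    Errors (G_t - V(s_t)) are accumulated over every step of every episode
    and applied once; nothing is stored in or drawn from a replay buffer.
    """
    errors = [0.0] * len(values)
    counts = [0] * len(values)
    for states, rewards in episodes:
        for s, g in zip(states, discounted_returns(rewards, gamma)):
            errors[s] += g - values[s]
            counts[s] += 1
    return [
        v + lr * errors[s] / counts[s] if counts[s] else v
        for s, v in enumerate(values)
    ]


def train(num_states=4, iterations=200, batch_size=16, gamma=0.9, lr=0.5, seed=0):
    """Alternate between sampling a fresh batch and one batch update."""
    rng = random.Random(seed)
    values = [0.0] * num_states
    for _ in range(iterations):
        episodes = [rollout(num_states, 50, rng) for _ in range(batch_size)]
        values = batch_update(values, episodes, gamma, lr)
    return values


if __name__ == "__main__":
    print(train())
```

Because every batch is sampled under the current estimate and discarded after one update, the sketch never reuses stale transitions, which is the contrast with experience replay that the abstract draws. Whether this matches the paper's actual update rule cannot be confirmed from the text above.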

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Adaptive Dynamic Programming Control