Variance Control for Distributional Reinforcement Learning

Qi Kuang; Zhoufan Zhu; Liwen Zhang; Fan Zhou

arXiv:2307.16152·cs.LG·August 1, 2023

TL;DR

This paper analyzes how Q-function approximation errors affect distributional reinforcement learning, introduces a new estimator, Quantiled Expansion Mean (QEM), and reports improved performance on Atari and MuJoCo benchmark tasks.

Contribution

The paper provides a theoretical error analysis, proposes the QEM estimator, and develops the QEMRL algorithm for better distributional RL performance.

Findings

01 QEMRL outperforms baseline algorithms in sample efficiency.

02 QEMRL shows improved convergence performance on Atari and MuJoCo tasks.

03 The error analysis shows theoretically how to reduce both the bias and the variance of the Q-function error terms in distributional RL.

Abstract

Although distributional reinforcement learning (DRL) has been widely examined in the past few years, very few studies investigate the validity of the obtained Q-function estimator in the distributional setting. To fully understand how the approximation errors of the Q-function affect the whole training process, we conduct an error analysis and theoretically show how to reduce both the bias and the variance of the error terms. With this new understanding, we construct a new estimator, Quantiled Expansion Mean (QEM), and introduce a new DRL algorithm (QEMRL) from the statistical perspective. We extensively evaluate our QEMRL algorithm on a variety of Atari and MuJoCo benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance.
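
For context on what "bias and variance of the error terms" can mean in practice: quantile-based distributional RL methods commonly take the Q-value to be the plain average of the estimated return quantiles. The sketch below is not the paper's QEM estimator. It is a minimal illustration, under assumptions that do not come from the paper (a hypothetical Gaussian return distribution, independent Gaussian noise on each quantile estimate, and the illustrative names `quantile_midpoints`, `q_from_quantiles`, and `simulate`), of how one might measure the bias and variance of that baseline average.

```python
import random
from statistics import NormalDist, fmean, pvariance


def quantile_midpoints(n):
    """Quantile levels tau_i = (2i - 1) / (2n) for i = 1..n."""
    return [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]


def q_from_quantiles(quantiles):
    """Baseline Q estimate: plain average of the quantile estimates."""
    return fmean(quantiles)


def simulate(n_quantiles=32, noise_std=0.5, trials=2000, seed=0):
    """Empirical bias and variance of the baseline Q estimate.

    Assumptions (illustrative only): returns follow N(1, 2^2), and each
    quantile estimate carries independent N(0, noise_std^2) error.
    """
    rng = random.Random(seed)
    dist = NormalDist(mu=1.0, sigma=2.0)  # hypothetical return distribution
    true_q = [dist.inv_cdf(t) for t in quantile_midpoints(n_quantiles)]

    estimates = []
    for _ in range(trials):
        # Perturb every quantile to mimic function-approximation error.
        noisy = [q + rng.gauss(0.0, noise_std) for q in true_q]
        estimates.append(q_from_quantiles(noisy))

    bias = fmean(estimates) - dist.mean   # average error of the estimator
    variance = pvariance(estimates)       # spread of the estimator across trials
    return bias, variance


bias, variance = simulate()
print(f"bias={bias:.4f}, variance={variance:.4f}")
```

Under these assumptions the variance of the baseline average is roughly `noise_std**2 / n_quantiles`, so the simulation gives a reference point against which an alternative estimator of the mean could be compared.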

Code & Models

Repositories

kuangqi927/qem (PyTorch, official)

Videos

Variance Control for Distributional Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications