How Does Return Distribution in Distributional Reinforcement Learning Help Optimization?

Ke Sun; Bei Jiang; Linglong Kong

arXiv:2209.14513·cs.LG·September 24, 2024

TL;DR

This paper investigates how the return distribution in distributional reinforcement learning stabilizes and accelerates optimization, providing theoretical insights and empirical validation within the Neural Fitted Z-Iteration framework.

Contribution

The paper shows that the return distribution gives distributional RL optimization benefits, namely smooth, stable gradients and an acceleration effect, supported by theoretical analysis and experiments.

Findings

01 Distributional RL has stable gradients due to its loss smoothness.

02 Return distribution approximation quality influences optimization speed.

03 Experiments confirm improved stability and acceleration over classical RL.

Abstract

Distributional reinforcement learning (RL), which learns the entire return distribution rather than only its expectation as in standard RL, has demonstrated remarkable success in enhancing performance. Despite these advances, our understanding of how the return distribution within distributional RL helps optimization remains limited. In this study, we investigate the optimization advantages of distributional RL, which exploits additional return distribution knowledge over classical RL, within the Neural Fitted Z-Iteration (Neural FZI) framework. To begin with, we demonstrate that the distribution loss of distributional RL has desirable smoothness characteristics and hence enjoys stable gradients, which is in line with its tendency to promote optimization stability. Furthermore, the acceleration effect of distributional RL is revealed by decomposing the return distribution. It shows that…
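The stable-gradient claim can be illustrated with a small numerical sketch. It rests on assumptions not stated in the text above: the return distribution is parameterized as a softmax over a fixed set of bins, and the distribution loss is a cross-entropy against a target histogram. The function names and the bin count are hypothetical, and this is not presented as the paper's exact loss. Under these assumptions the gradient with respect to the logits is the difference of two probability vectors, so its norm is bounded, whereas the gradient of a squared loss on the expected return grows with the size of the target.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D array of logits.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def categorical_loss_grad(logits, target_probs):
    # Gradient of cross-entropy(target_probs, softmax(logits)) w.r.t. logits.
    # Both terms lie on the probability simplex, so the L2 norm is at most sqrt(2).
    return softmax(logits) - target_probs

def squared_loss_grad(q_pred, q_target):
    # Gradient of (q_pred - q_target)^2 w.r.t. q_pred; scales with the target.
    return 2.0 * (q_pred - q_target)

rng = np.random.default_rng(0)
n_bins = 51  # hypothetical number of histogram bins

cat_norms, sq_mags = [], []
for scale in (1.0, 10.0, 100.0):
    logits = rng.normal(size=n_bins) * scale       # increasingly extreme predictions
    target = rng.dirichlet(np.ones(n_bins))        # arbitrary target histogram
    cat_norms.append(np.linalg.norm(categorical_loss_grad(logits, target)))
    sq_mags.append(abs(squared_loss_grad(0.0, scale * 5.0)))

print("categorical gradient norms:", cat_norms)
print("squared-loss gradient magnitudes:", sq_mags)
```

The categorical gradient norms stay below the square root of 2 regardless of the logit scale, while the squared-loss gradient magnitude grows linearly with the return target. This is one simple sense in which a distribution loss can yield more stable gradients.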

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Neural Networks and Applications