Implicit Distributional Reinforcement Learning

Yuguang Yue; Zhendong Wang; Mingyuan Zhou

arXiv:2007.06159 · cs.LG · October 21, 2020 · 5 citations

TL;DR

This paper introduces Implicit Distributional Actor-Critic (IDAC), a reinforcement learning method that models complex return distributions and policies using implicit and semi-implicit distributions, leading to improved sample efficiency and performance in continuous action tasks.

Contribution

The paper proposes a novel implicit distributional approach with deep generator networks and semi-implicit policies, enhancing modeling flexibility and performance in policy-gradient reinforcement learning.

Findings

1. IDAC outperforms state-of-the-art algorithms on OpenAI Gym benchmarks.
2. The implicit distributional approach captures complex return and policy properties.
3. IDAC demonstrates improved sample efficiency in continuous control tasks.

Abstract

To improve the sample efficiency of policy-gradient based reinforcement learning algorithms, we propose implicit distributional actor-critic (IDAC) that consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-implicit actor (SIA), powered by a flexible policy distribution. We adopt a distributional perspective on the discounted cumulative return and model it with a state-action-dependent implicit distribution, which is approximated by the DGNs that take state-action pairs and random noises as their input. Moreover, we use the SIA to provide a semi-implicit policy distribution, which mixes the policy parameters with a reparameterizable distribution that is not constrained by an analytic density function. In this way, the policy's marginal distribution is implicit, providing the potential to model complex properties such as covariance structure and…
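The two components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the networks are untrained, the layer sizes and all names (`init_mlp`, `sample_returns`, `sample_action`, `NOISE_DIM`, and so on) are hypothetical, only one generator is shown where IDAC uses two, and the Gaussian conditional in the actor is an assumed choice of reparameterizable distribution. The point is only the data flow: noise goes in alongside the state (and action), so the output is a sample from a distribution that has no closed-form density.

```python
import math
import random

STATE_DIM, ACTION_DIM, NOISE_DIM, HIDDEN = 3, 2, 4, 16  # hypothetical sizes

def init_mlp(sizes, rng):
    """Random weights for a tiny fully connected network (untrained, illustrative)."""
    return [([[rng.gauss(0.0, 0.3) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for i, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(params) - 1:
            x = [math.tanh(v) for v in x]
    return x

def noise(dim, rng):
    """Standard normal noise vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

_init = random.Random(0)
critic = init_mlp([STATE_DIM + ACTION_DIM + NOISE_DIM, HIDDEN, 1], _init)
actor = init_mlp([STATE_DIM + NOISE_DIM, HIDDEN, 2 * ACTION_DIM], _init)

def sample_returns(state, action, n_samples, rng):
    """Generator-style critic: each fresh noise draw yields one return sample
    for the given state-action pair."""
    return [mlp(critic, state + action + noise(NOISE_DIM, rng))[0]
            for _ in range(n_samples)]

def sample_action(state, rng):
    """Semi-implicit actor: noise is mapped to Gaussian parameters (assumed form),
    then the action is drawn by reparameterization. Marginalizing over the noise
    gives an implicit policy distribution."""
    out = mlp(actor, state + noise(NOISE_DIM, rng))
    mean, log_std = out[:ACTION_DIM], out[ACTION_DIM:]
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mean, log_std)]
```

Calling `sample_returns([0.1, -0.2, 0.3], [0.0, 0.5], 16, random.Random(1))` would give 16 draws that together approximate the return distribution at that state-action pair, and repeated calls to `sample_action` with the same state would give different actions whose spread can vary with the noise input.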

Code & Models

Videos

Implicit Distributional Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks