Implicit Distributional Reinforcement Learning
Yuguang Yue, Zhendong Wang, Mingyuan Zhou

TL;DR
This paper introduces Implicit Distributional Actor-Critic (IDAC), a reinforcement learning method that models complex return distributions and policies using implicit and semi-implicit distributions, leading to improved sample efficiency and performance in continuous action tasks.
Contribution
The paper proposes a novel implicit distributional approach with deep generator networks and semi-implicit policies, enhancing modeling flexibility and performance in policy-gradient reinforcement learning.
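A semi-implicit policy of the kind described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian conditional layer whose mean and log standard deviation are linear functions of the state concatenated with a random noise vector. The name `make_semi_implicit_policy`, the fixed random weights, and the clipping range are all hypothetical. Given the noise, the action distribution is an explicit Gaussian; once the noise is marginalized out, the policy has no closed-form density.

```python
import math
import random


def make_semi_implicit_policy(state_dim, action_dim, noise_dim, seed=0):
    """Build a toy semi-implicit policy (hypothetical; weights are random, not trained)."""
    init = random.Random(seed)
    in_dim = state_dim + noise_dim
    scale = 1.0 / math.sqrt(in_dim)
    # Linear maps from [state, noise] to the Gaussian mean and log std.
    w_mu = [[init.gauss(0, scale) for _ in range(in_dim)] for _ in range(action_dim)]
    w_ls = [[init.gauss(0, scale) for _ in range(in_dim)] for _ in range(action_dim)]

    def sample_action(state, sample_rng):
        # Mixing step: draw noise that randomizes the Gaussian's parameters.
        eps = [sample_rng.gauss(0, 1) for _ in range(noise_dim)]
        x = list(state) + eps
        mu = [sum(w * xi for w, xi in zip(row, x)) for row in w_mu]
        # Clip the log std to keep the toy example numerically tame (assumption).
        log_std = [max(-5.0, min(1.0, sum(w * xi for w, xi in zip(row, x))))
                   for row in w_ls]
        # Reparameterized draw from the explicit conditional Gaussian.
        return [m + math.exp(ls) * sample_rng.gauss(0, 1)
                for m, ls in zip(mu, log_std)]

    return sample_action


policy = make_semi_implicit_policy(state_dim=3, action_dim=2, noise_dim=4)
state = [0.1, -0.2, 0.3]
action = policy(state, random.Random(7))
actions = [policy(state, random.Random(i)) for i in range(5)]
```

Because both the noise and the Gaussian draw are reparameterized, gradients could in principle flow through the sampled action to the policy weights, which is what an actor update would rely on.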
Findings
The authors report that IDAC outperforms state-of-the-art algorithms on OpenAI Gym benchmarks.
The implicit distributional approach has the potential to capture complex properties of the return and policy distributions, such as covariance structure in the policy.
IDAC demonstrates improved sample efficiency in continuous control tasks.
Abstract
To improve the sample efficiency of policy-gradient based reinforcement learning algorithms, we propose implicit distributional actor-critic (IDAC) that consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-implicit actor (SIA), powered by a flexible policy distribution. We adopt a distributional perspective on the discounted cumulative return and model it with a state-action-dependent implicit distribution, which is approximated by the DGNs that take state-action pairs and random noises as their input. Moreover, we use the SIA to provide a semi-implicit policy distribution, which mixes the policy parameters with a reparameterizable distribution that is not constrained by an analytic density function. In this way, the policy's marginal distribution is implicit, providing the potential to model complex properties such as covariance structure and…
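The idea of an implicit return distribution can be illustrated with a toy generator that maps a state-action pair and a noise vector to one sample of the discounted return. This is only a sketch under stated assumptions: the one-hidden-layer network, its random untrained weights, and the names `make_generator` and `sample_returns` are hypothetical, and only a single generator is shown rather than the two DGNs the abstract mentions. The distribution is never written down as a density; it is defined only through the samples the generator produces.

```python
import math
import random


def make_generator(state_dim, action_dim, noise_dim, hidden=16, seed=0):
    """Build a toy generator G(s, a, eps) -> scalar return sample (hypothetical)."""
    init = random.Random(seed)
    in_dim = state_dim + action_dim + noise_dim
    w1 = [[init.gauss(0, 1.0 / math.sqrt(in_dim)) for _ in range(in_dim)]
          for _ in range(hidden)]
    w2 = [init.gauss(0, 1.0 / math.sqrt(hidden)) for _ in range(hidden)]

    def generator(state, action, noise):
        # Concatenate state, action, and noise, then apply a small tanh network.
        x = list(state) + list(action) + list(noise)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return sum(w * hi for w, hi in zip(w2, h))

    return generator


def sample_returns(generator, state, action, noise_dim, n_samples, rng):
    """Draw samples from the implicit return distribution at (state, action)."""
    return [
        generator(state, action, [rng.gauss(0, 1) for _ in range(noise_dim)])
        for _ in range(n_samples)
    ]


G = make_generator(state_dim=3, action_dim=2, noise_dim=4)
samples = sample_returns(G, [0.1, -0.2, 0.3], [0.5, -0.5], 4, 100, random.Random(1))
# A Q-value estimate is the mean of the sampled returns.
q_estimate = sum(samples) / len(samples)
```

Averaging the samples recovers the usual scalar Q-value, while the spread of the samples carries the additional distributional information.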
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
