Implicit Quantile Networks for Distributional Reinforcement Learning

Will Dabney; Georg Ostrovski; David Silver; Rémi Munos

arXiv:1806.06923 · cs.LG · June 20, 2018 · 197 cites

TL;DR

This paper introduces Implicit Quantile Networks, a novel distributional reinforcement learning method that uses quantile regression to model return distributions, leading to improved Atari game performance and enabling risk-sensitive policies.

Contribution

It presents a flexible, state-of-the-art distributional RL approach using implicit quantile functions, advancing the modeling of return distributions and risk sensitivity.

Findings

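01. Improved performance on the 57 Atari 2600 games in the Arcade Learning Environment (ALE).

02. Used the implicitly defined return distributions to study the effects of risk-sensitive policies in Atari games.

03. Showed that quantile regression can approximate the full quantile function of the state-action return distribution.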

Abstract

In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm's implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.
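The two ideas in the abstract, quantile regression on the return distribution and risk-sensitive action selection, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the quantile Huber loss commonly used in quantile-regression-based distributional RL, and it uses a CVaR-style distortion (sampling quantile fractions only from the lower `eta` portion of [0, 1]) as one illustrative risk measure; the abstract does not specify either. The names `quantile_huber_loss`, `cvar_action`, and `toy_quantile_fn` are hypothetical, and the toy quantile function stands in for a learned network.

```python
import numpy as np


def quantile_huber_loss(pred, target, taus, kappa=1.0):
    """Quantile Huber loss between predicted quantiles and target samples.

    pred:   (N,) predicted return values at quantile fractions `taus`
    target: (M,) target return samples
    taus:   (N,) quantile fractions in [0, 1]
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    taus = np.asarray(taus, dtype=float)
    # Pairwise errors: delta[i, j] = target[j] - pred[i]
    delta = target[None, :] - pred[:, None]
    abs_delta = np.abs(delta)
    # Huber loss: quadratic near zero, linear beyond kappa
    huber = np.where(abs_delta <= kappa,
                     0.5 * delta ** 2,
                     kappa * (abs_delta - 0.5 * kappa))
    # Asymmetric weight: underestimates are penalised by tau,
    # overestimates by (1 - tau)
    weight = np.abs(taus[:, None] - (delta < 0).astype(float))
    # Average over target samples, sum over predicted quantiles
    return (weight * huber / kappa).mean(axis=1).sum()


def cvar_action(quantile_fn, num_actions, eta=0.25, num_samples=32, rng=None):
    """Pick the action with the best mean return over the lower `eta` tail.

    eta = 1.0 samples fractions uniformly from [0, 1), i.e. risk-neutral;
    eta < 1.0 only looks at the worst outcomes, i.e. risk-averse.
    """
    rng = np.random.default_rng() if rng is None else rng
    taus = eta * rng.uniform(size=num_samples)
    values = [np.mean(quantile_fn(a, taus)) for a in range(num_actions)]
    return int(np.argmax(values))


def toy_quantile_fn(action, taus):
    """Hypothetical stand-in for a learned quantile network.

    Action 0 always returns 1.0; action 1 returns Uniform(-2, 6),
    which has a higher mean (2.0) but a worse lower tail.
    """
    taus = np.asarray(taus, dtype=float)
    if action == 0:
        return np.full_like(taus, 1.0)
    return -2.0 + 8.0 * taus
```

With the toy quantile function, a risk-neutral agent (`eta=1.0`) prefers the higher-mean action 1, while a risk-averse agent (`eta=0.25`) prefers the safe action 0, because it only evaluates the lower quarter of each return distribution.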

Code & Models

Models

🤗 TheoVincent/Atari_i-QN (model)

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Neural Networks and Applications

Methods: Convolution · Dense Connections · Q-Learning · Deep Q-Network