Distributional Reinforcement Learning with Quantile Regression

Will Dabney; Mark Rowland; Marc G. Bellemare; Rémi Munos

arXiv:1710.10044·cs.AI·October 30, 2017·150 cites

TL;DR

This paper advances distributional reinforcement learning by developing a new algorithm based on quantile regression, which models the return distribution explicitly and outperforms existing methods like C51 on Atari games.

Contribution

The paper introduces a distributional RL algorithm based on quantile regression, extends existing theoretical results to the approximate distribution setting, and reports better performance than C51 on Atari 2600 games.

Findings

01. The new algorithm outperforms C51 on Atari 2600 games.

02. Theoretical results are extended to approximate distribution settings.

03. Empirical results show significant improvements over recent methods.

Abstract

In reinforcement learning an agent interacts with the environment by taking actions and observing the next state and reward. When sampled probabilistically, these state transitions, rewards, and actions can all induce randomness in the observed long-term return. Traditionally, reinforcement learning algorithms average over this randomness to estimate the value function. In this paper, we build on recent work advocating a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating the mean. That is, we examine methods of learning the value distribution instead of the value function. We give results that close a number of gaps between the theoretical and algorithmic results given by Bellemare, Dabney, and Munos (2017). First, we extend existing results to the approximate distribution setting. Second, we present…
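To make the idea of modeling the return distribution rather than only its mean concrete, here is a minimal sketch of quantile regression on sampled returns. It is not the paper's algorithm or network: the bimodal return samples, the helper names `pinball_loss` and `fit_quantiles`, the midpoint quantile levels, and the plain subgradient-descent loop are all assumptions chosen for illustration.

```python
import numpy as np

def pinball_loss(theta, samples, tau):
    """Mean quantile-regression (pinball) loss of the estimate theta at level tau."""
    u = samples - theta
    return np.mean(np.where(u >= 0, tau * u, (tau - 1.0) * u))

def fit_quantiles(samples, n_quantiles=4, lr=0.05, steps=2000):
    """Fit n_quantiles estimates by subgradient descent on the pinball loss."""
    # Quantile levels at the midpoints of n_quantiles equal-probability bins.
    taus = (2 * np.arange(n_quantiles) + 1) / (2.0 * n_quantiles)
    theta = np.zeros(n_quantiles)
    for _ in range(steps):
        # Subgradient of the expected loss w.r.t. theta_i is P(x < theta_i) - tau_i.
        below = (samples[None, :] < theta[:, None]).mean(axis=1)
        theta -= lr * (below - taus)
    return taus, theta

# Hypothetical returns: half the episodes end near -1, half near +2.
rng = np.random.default_rng(0)
n = 5000
returns = np.where(rng.random(n) < 0.5,
                   rng.normal(-1.0, 0.2, n),
                   rng.normal(2.0, 0.2, n))

taus, theta = fit_quantiles(returns)
print("mean return:", returns.mean())        # a single number between the two modes
print("fitted quantiles:", theta)            # two values near each mode
print("empirical quantiles:", np.quantile(returns, taus))
```

The mean falls between the two modes, where almost no returns actually occur, while the fitted quantiles sit inside the modes. A deep RL agent would replace the fixed samples with bootstrapped targets and the `theta` vector with network outputs; that part is outside this sketch.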

Taxonomy

Topics: Sports Analytics and Performance · Reinforcement Learning in Robotics

Methods: Convolution · Dense Connections · Q-Learning · Deep Q-Network