Count-Based Exploration with Neural Density Models

Georg Ostrovski; Marc G. Bellemare; Aaron van den Oord; Remi Munos

arXiv:1703.01310 · cs.AI · June 15, 2017 · 222 cites

TL;DR

This paper extends count-based exploration in reinforcement learning by deriving pseudo-counts from a neural density model, PixelCNN, and reports state-of-the-art results on several Atari games.

Contribution

The paper uses PixelCNN to supply pseudo-counts for exploration, examines how the quality of the density model affects exploration, and shows that the mixed Monte Carlo update helps in sparse-reward environments.
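The mixed Monte Carlo update mentioned above is commonly written as a blend of the one-step Q-learning target and the full Monte Carlo return of the episode. The following is a minimal sketch of that idea, not the authors' implementation; the function names and the mixing weight `beta` are illustrative assumptions.

```python
def monte_carlo_returns(rewards, gamma):
    """Discounted return from each time step to the end of the episode."""
    returns = [0.0] * len(rewards)
    g = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


def mixed_target(reward, next_q_max, mc_return, gamma, beta):
    """Blend the one-step Q-learning target with the Monte Carlo return.

    beta = 0 gives the plain one-step target; beta = 1 gives the pure
    Monte Carlo return. `beta` is a hypothetical name for the mixing weight.
    """
    one_step = reward + gamma * next_q_max
    return (1.0 - beta) * one_step + beta * mc_return


# Sparse-reward episode: the only reward arrives at the last step.
rewards = [0.0, 0.0, 1.0]
returns = monte_carlo_returns(rewards, gamma=0.5)
# With an untrained value estimate (next_q_max = 0), the one-step target at
# t = 0 carries no signal, while the mixed target already reflects the reward.
target_t0 = mixed_target(rewards[0], 0.0, returns[0], gamma=0.5, beta=0.5)
```

In a sparse-reward episode like this one, the Monte Carlo term propagates the final reward to early states in a single update, whereas a one-step target needs many updates to carry it back.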

Findings

01. PixelCNN pseudo-counts improve exploration performance
02. Mixed Monte Carlo updates facilitate exploration in sparse settings
03. State-of-the-art results on several Atari games

Abstract

Bellemare et al. (2016) introduced the notion of a pseudo-count, derived from a density model, to generalize count-based exploration to non-tabular reinforcement learning. This pseudo-count was used to generate an exploration bonus for a DQN agent and combined with a mixed Monte Carlo update was sufficient to achieve state of the art on the Atari 2600 game Montezuma's Revenge. We consider two questions left open by their work: First, how important is the quality of the density model for exploration? Second, what role does the Monte Carlo update play in exploration? We answer the first question by demonstrating the use of PixelCNN, an advanced neural density model for images, to supply a pseudo-count. In particular, we examine the intrinsic difficulties in adapting Bellemare et al.'s approach when assumptions about the model are violated. The result is a more practical and general…
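To make the pseudo-count idea concrete, the sketch below applies the definition from Bellemare et al. (2016): the pseudo-count of a state is computed from the density model's probability of that state before and after one training step on it. The paper uses PixelCNN over game frames; here a toy empirical-frequency model stands in for it, and the class name, function names, and bonus constants (`scale`, `eps`) are assumptions chosen for illustration.

```python
from collections import Counter
import math


class EmpiricalDensity:
    """Toy stand-in for a density model: rho(x) = count(x) / total."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def prob(self, x):
        return self.counts[x] / self.total if self.total else 0.0

    def update(self, x):
        self.counts[x] += 1
        self.total += 1


def pseudo_count(model, x):
    """Pseudo-count of x; note that this also trains the model on x once."""
    rho = model.prob(x)        # probability before observing x
    model.update(x)
    rho_prime = model.prob(x)  # "recoding" probability after observing x
    if rho_prime <= rho:
        # The model's estimate did not increase; treat x as fully familiar.
        return math.inf
    return rho * (1.0 - rho_prime) / (rho_prime - rho)


def exploration_bonus(n_hat, scale=0.1, eps=0.01):
    """Bonus that shrinks as the pseudo-count grows (illustrative constants)."""
    return scale / math.sqrt(n_hat + eps)


model = EmpiricalDensity()
for state in ["a", "a", "a", "b"]:
    model.update(state)

n_seen = pseudo_count(model, "a")   # "a" was observed 3 times before this call
n_novel = pseudo_count(model, "c")  # "c" has never been observed
```

For this empirical model the pseudo-count recovers the true visit count, which is the sanity property that motivates the definition. A learned model such as PixelCNN generalizes across similar frames, so its pseudo-counts are generally not integers.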

Code & Models

Repositories

nolisten/erl (tf)

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)

Methods: Q-Learning · Dense Connections · Convolution · PixelCNN · Deep Q-Network