EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

Justin Fu, John D. Co-Reyes, and Sergey Levine

arXiv:1703.01260 · cs.LG · May 30, 2017 · 61 cites

TL;DR

This paper introduces a discriminative exemplar-based exploration method for deep reinforcement learning that effectively addresses sparse reward challenges, especially with high-dimensional observations like raw images.

Contribution

It presents a novel exploration algorithm using discriminatively trained exemplar models, avoiding complex generative models and achieving state-of-the-art results on challenging benchmarks.

Findings

01 Effective exploration in sparse reward environments

02 State-of-the-art results on vizDoom benchmark

03 Implicit density estimation via discriminative models

Abstract

Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes. However, sparse reward problems remain a significant challenge. Exploration methods based on novelty detection have been particularly successful in such settings but typically require generative or predictive models of the observations, which can be difficult to train when the observations are very high-dimensional and complex, as in the case of raw images. We propose a novelty detection algorithm for exploration that is based entirely on discriminatively trained exemplar models, where classifiers are trained to discriminate each visited state against all others. Intuitively, novel states are easier to distinguish against other states seen during training. We show that this kind of discriminative modeling corresponds to implicit density estimation, and that it can be…
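
The link between exemplar discrimination and implicit density estimation can be illustrated with a small tabular sketch. It is not the paper's neural-network implementation. It assumes discrete states, one positive example (the exemplar x*) weighed equally against negatives drawn from the visited-state distribution P, and a discriminator with a single free logit at x*. Under those assumptions the cross-entropy optimum is D(x*) = 1 / (1 + P(x*)), so P(x*) can be read back as (1 - D(x*)) / D(x*). All function names and the -log P bonus are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def empirical_density(states):
    """Fraction of visits spent in each discrete state."""
    counts = Counter(states)
    total = len(states)
    return {s: c / total for s, c in counts.items()}


def train_exemplar_logit(p_exemplar, steps=5000, lr=0.5):
    """Fit one logit for D(x*) by gradient descent on cross-entropy.

    Positives: the exemplar x* itself (weight 1).
    Negatives: states drawn from the visited distribution, which land
    on x* with probability p_exemplar (assumed known in this toy).
    """
    theta = 0.0
    for _ in range(steps):
        d = 1.0 / (1.0 + math.exp(-theta))
        # Gradient of -log(d) - p * log(1 - d) with respect to theta.
        grad = -(1.0 - d) + p_exemplar * d
        theta -= lr * grad
    return 1.0 / (1.0 + math.exp(-theta))


def density_from_discriminator(d):
    """Invert D(x*) = 1 / (1 + P(x*)) to recover P(x*)."""
    return (1.0 - d) / d


# Toy visitation history: "a" is common, "c" is rare (novel).
visited = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
density = empirical_density(visited)

# One discriminator per exemplar state, trained independently.
disc = {s: train_exemplar_logit(p) for s, p in density.items()}
recovered = {s: density_from_discriminator(d) for s, d in disc.items()}

# Hypothetical novelty bonus: larger for states with lower implied density.
bonus = {s: -math.log(p) for s, p in recovered.items()}
```

In this toy, the rare state "c" ends up with the discriminator output closest to 1, which matches the intuition in the abstract that novel states are easier to tell apart from the rest. With image observations there is no table of frequencies to consult, which is why a trained classifier stands in for the density.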

Code & Models

Repositories

jcoreyes/ex2

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Smart Grid Energy Management