DORA The Explorer: Directed Outreaching Reinforcement Action-Selection

Leshem Choshen; Lior Fox; Yonatan Loewenstein

arXiv:1804.04012·cs.LG·April 12, 2018·37 cites

TL;DR

This paper introduces E-values, a novel model-free method for directed exploration in reinforcement learning that improves learning efficiency and performance, especially in continuous environments like Atari games.

Contribution

The paper proposes E-values as a generalization of counters for directed exploration, addressing their locality issue and enabling efficient learning in continuous MDPs.

Findings

1. E-values outperform traditional counters in exploration tasks.

2. E-values improve learning speed and performance in Atari 2600 games.

3. The method can be integrated with function approximation for continuous environments.

Abstract

Exploration is a fundamental aspect of Reinforcement Learning, typically implemented using stochastic action-selection. Exploration, however, can be more efficient if directed toward gaining new world knowledge. Visit-counters have been proven useful both in practice and in theory for directed exploration. However, a major limitation of counters is their locality. While there are a few model-based solutions to this shortcoming, a model-free approach is still missing. We propose $E$-values, a generalization of counters that can be used to evaluate the propagating exploratory value over state-action trajectories. We compare our approach to commonly used RL techniques, and show that using $E$-values improves learning and performance over traditional counters. We also show how our method can be implemented with function approximation to efficiently learn continuous MDPs. We demonstrate this…
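
The abstract describes $E$-values only at a high level. The following is a minimal tabular sketch of one way such a value could generalize a visit counter. It assumes a SARSA-style update with zero reward and an initial value of 1 for every state-action pair. The update rule and the names `ETable`, `alpha`, `gamma_e`, and `generalized_count` are assumptions made for illustration and are not taken from the text above.

```python
import math
from collections import defaultdict


class ETable:
    """Tabular E-values (illustrative sketch, not the authors' code)."""

    def __init__(self, alpha=0.1, gamma_e=0.9):
        self.alpha = alpha      # learning rate, assumed 0 < alpha < 1
        self.gamma_e = gamma_e  # exploration discount factor (assumed name)
        # Every unseen (state, action) pair starts at E = 1.
        self.e = defaultdict(lambda: 1.0)

    def update(self, state, action, next_state, next_action):
        # SARSA-style update with zero reward: E decays toward the
        # discounted E-value of the successor pair.
        key = (state, action)
        target = self.gamma_e * self.e[(next_state, next_action)]
        self.e[key] = (1 - self.alpha) * self.e[key] + self.alpha * target

    def generalized_count(self, state, action):
        # log base (1 - alpha) of E. With gamma_e = 0 this equals the
        # number of updates applied to the pair, i.e. a plain counter.
        return math.log(self.e[(state, action)]) / math.log(1 - self.alpha)


# With gamma_e = 0 the E-value reduces to an ordinary visit counter.
local = ETable(alpha=0.1, gamma_e=0.0)
for _ in range(3):
    local.update("s", "a", "s_next", "a")
count_local = local.generalized_count("s", "a")

# With gamma_e > 0, one visit that leads to an unexplored successor
# counts for less than one full visit.
propagating = ETable(alpha=0.1, gamma_e=0.9)
propagating.update("s", "a", "s_next", "a")
count_propagating = propagating.generalized_count("s", "a")
```

Under these assumptions, setting `gamma_e` to zero recovers a purely local counter, while a positive `gamma_e` lets the exploratory value of later state-action pairs propagate back along the trajectory, which is the non-local behavior the abstract attributes to $E$-values.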

Taxonomy

Topics: Open Source Software Innovations