Exploration Conscious Reinforcement Learning Revisited

Lior Shani; Yonathan Efroni; Shie Mannor

arXiv:1812.05551 · cs.LG · September 10, 2019 · 5 cites

TL;DR

This paper revisits exploration strategies in reinforcement learning by introducing exploration-conscious criteria, leading to policies optimized for specific exploration mechanisms and demonstrating improved empirical performance over traditional methods.

Contribution

It proposes exploration-conscious criteria, formulates them as surrogate MDPs, and adapts existing algorithms to achieve superior results in both discrete and continuous settings.

Findings

01. Exploration-conscious policies outperform standard methods.

02. Simple modifications to algorithms yield significant performance gains.

03. Applicable to both tabular and deep RL in discrete and continuous spaces.

Abstract

The Exploration-Exploitation tradeoff arises in Reinforcement Learning when one cannot tell if a policy is optimal. Then, there is a constant need to explore new actions instead of exploiting past experience. In practice, it is common to resolve the tradeoff by using a fixed exploration mechanism, such as $\epsilon$-greedy exploration or adding Gaussian noise, while still trying to learn an optimal policy. In this work, we take a different approach and study exploration-conscious criteria that result in optimal policies with respect to the exploration mechanism. Solving these criteria, as we establish, amounts to solving a surrogate Markov Decision Process. We continue and analyze properties of exploration-conscious optimal policies and characterize two general approaches to solve such criteria. Building on the approaches, we apply simple changes in existing tabular and deep…
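The surrogate-MDP idea in the abstract can be illustrated with a small tabular sketch. It is not taken from the paper's code. It assumes one particular exploration mechanism: the agent plays its chosen action with probability 1 - alpha and an action drawn from a fixed policy pi0 (uniform here) with probability alpha. Under that assumption, the exploration can be folded into the rewards and transitions, and ordinary value iteration on the mixed model gives a policy that accounts for its own exploration. The function names, the three-state "edge" MDP, and all numeric values are hypothetical.

```python
import numpy as np

def surrogate_mdp(P, R, alpha, pi0=None):
    """Fold a fixed exploration policy pi0 into the dynamics and rewards.

    P: (S, A, S) transition probabilities, R: (S, A) expected rewards.
    Assumption: the chosen action is played w.p. 1 - alpha, and an action
    sampled from pi0 (uniform by default) is played w.p. alpha.
    """
    S, A, _ = P.shape
    if pi0 is None:
        pi0 = np.full((S, A), 1.0 / A)
    P_pi0 = np.einsum("sa,sat->st", pi0, P)  # dynamics under pi0, shape (S, S)
    R_pi0 = np.einsum("sa,sa->s", pi0, R)    # reward under pi0, shape (S,)
    P_alpha = (1 - alpha) * P + alpha * P_pi0[:, None, :]
    R_alpha = (1 - alpha) * R + alpha * R_pi0[:, None]
    return P_alpha, R_alpha

def value_iteration(P, R, gamma=0.9, iters=500):
    """Plain Q-value iteration on a tabular MDP."""
    Q = np.zeros_like(R)
    for _ in range(iters):
        V = Q.max(axis=1)
        Q = R + gamma * (P @ V)  # (S, A, S) @ (S,) -> (S, A)
    return Q

# Hypothetical toy MDP: 0 = start, 1 = edge, 2 = absorbing terminal.
# start: action 0 stays (reward 0.6), action 1 moves to the edge (reward 0).
# edge:  action 0 stays (reward 1),   action 1 falls off (reward -10).
P = np.zeros((3, 2, 3))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 2] = 1.0
P[2, :, 2] = 1.0
R = np.array([[0.6, 0.0], [1.0, -10.0], [0.0, 0.0]])

P_alpha, R_alpha = surrogate_mdp(P, R, alpha=0.2)
greedy_std = value_iteration(P, R).argmax(axis=1)
greedy_con = value_iteration(P_alpha, R_alpha).argmax(axis=1)
print("standard greedy actions:             ", greedy_std)
print("exploration-conscious greedy actions:", greedy_con)
```

On this toy problem the standard solution walks to the edge, since it ignores the chance of an exploratory fall, while the solution of the mixed model stays at the start. The difference comes only from the modified model; the planning routine is unchanged.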

Code & Models

Repositories

shanlior/ExplorationConsciousRL (tf, Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Explainable Artificial Intelligence (XAI)