Exploration Conscious Reinforcement Learning Revisited
Lior Shani, Yonathan Efroni, Shie Mannor

TL;DR
This paper revisits exploration strategies in reinforcement learning by introducing exploration-conscious criteria, leading to policies optimized for specific exploration mechanisms and demonstrating improved empirical performance over traditional methods.
Contribution
The paper proposes exploration-conscious criteria, formulates them as surrogate MDPs, and adapts existing algorithms to achieve superior results in both discrete and continuous settings.
Findings
Exploration-conscious policies outperform standard methods.
Simple modifications to algorithms yield significant performance gains.
The approach applies to both tabular and deep RL, in discrete and continuous settings.
Abstract
The Exploration-Exploitation tradeoff arises in Reinforcement Learning when one cannot tell if a policy is optimal. Then, there is a constant need to explore new actions instead of exploiting past experience. In practice, it is common to resolve the tradeoff by using a fixed exploration mechanism, such as ε-greedy exploration or by adding Gaussian noise, while still trying to learn an optimal policy. In this work, we take a different approach and study exploration-conscious criteria, that result in optimal policies with respect to the exploration mechanism. Solving these criteria, as we establish, amounts to solving a surrogate Markov Decision Process. We continue and analyze properties of exploration-conscious optimal policies and characterize two general approaches to solve such criteria. Building on the approaches, we apply simple changes in existing tabular and deep…
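The abstract's claim that an exploration-conscious criterion reduces to a surrogate MDP can be illustrated with a small tabular sketch. It is not the authors' code, and it assumes a specific mechanism: ε-greedy exploration in which, with probability `eps`, the chosen action is replaced by a uniformly random one. Under that assumption the surrogate MDP mixes the original rewards and transitions with their uniform-action averages. The function names (`surrogate_mdp`, `value_iteration`, `evaluate_eps_greedy`) are hypothetical.

```python
import numpy as np

def surrogate_mdp(P, R, eps):
    """Assumed construction: the MDP seen by an agent whose chosen action is
    replaced by a uniformly random action with probability eps.
    P has shape (S, A, S); R has shape (S, A)."""
    P_rand = P.mean(axis=1, keepdims=True)  # dynamics under a uniform action
    R_rand = R.mean(axis=1, keepdims=True)  # reward under a uniform action
    return (1 - eps) * P + eps * P_rand, (1 - eps) * R + eps * R_rand

def value_iteration(P, R, gamma, iters=2000):
    """Plain value iteration; returns the value estimate and a greedy policy."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)  # (S, A, S) @ (S,) -> (S, A)
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)

def evaluate_eps_greedy(P, R, policy, eps, gamma):
    """Exact value of acting eps-greedily around `policy` in the original MDP."""
    S, A = R.shape
    pi = np.full((S, A), eps / A)
    pi[np.arange(S), policy] += 1 - eps
    P_pi = np.einsum("sa,sat->st", pi, P)  # state-to-state transitions under pi
    R_pi = (pi * R).sum(axis=1)
    return np.linalg.solve(np.eye(S) - gamma * P_pi, R_pi)
```

Solving the surrogate MDP with `value_iteration` yields a deterministic policy whose surrogate value matches what `evaluate_eps_greedy` reports for that policy in the original MDP. By construction, that value is no lower than the ε-greedy value of the policy that is optimal for the original MDP.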
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Explainable Artificial Intelligence (XAI)
