Exploration in Feature Space for Reinforcement Learning
Suraj Narayanan Sasikumar

TL;DR
This paper introduces an exploration method for reinforcement learning that works in feature space: it generalizes visit counts through state features, enabling efficient exploration in high-dimensional environments such as Atari games.
Contribution
It proposes a $\textit{pseudocount}$ method leveraging feature representations for scalable, generalizable exploration in high-dimensional RL tasks.
Findings
Achieves near state-of-the-art results on high-dimensional benchmarks.
Demonstrates success on difficult Atari games like Montezuma's Revenge.
Offers a computationally efficient exploration algorithm.
Abstract
The infamous exploration-exploitation dilemma is one of the oldest and most important problems in reinforcement learning (RL). Deliberate and effective exploration is necessary for RL agents to succeed in most environments. However, until very recently even very sophisticated RL algorithms employed simple, undirected exploration strategies in large-scale RL tasks. We introduce a new optimistic count-based exploration algorithm for RL that is feasible in high-dimensional MDPs. The success of RL algorithms in these domains depends crucially on generalization from limited training experience. Function approximation techniques enable RL agents to generalize in order to estimate the value of unvisited states, but at present few methods have achieved generalization about the agent's uncertainty regarding unvisited states. We present a new method for computing a generalized state…
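The abstract is cut off before it describes how the generalized count is computed, so the following is only a minimal sketch of one way a count-based bonus can be derived from features, not the paper's confirmed method. It assumes binary feature vectors, a density that factorises over features, a Krichevsky-Trofimov estimate per feature, and the pseudocount form N = ρ(1 − ρ′)/(ρ′ − ρ), where ρ′ is the density after observing the same features once more. The class name `FeaturePseudocount`, the parameter `beta`, and the `0.01` stabiliser are hypothetical choices made for illustration.

```python
import math


class FeaturePseudocount:
    """Illustrative sketch: pseudocounts from a factorised density over binary features."""

    def __init__(self, num_features):
        self.ones = [0] * num_features  # how often each feature was observed as 1
        self.t = 0                      # number of feature vectors observed so far

    @staticmethod
    def _log_density(phi, ones, t):
        # Assumed per-feature Krichevsky-Trofimov estimate: (count + 1/2) / (t + 1).
        total = 0.0
        for bit, n_one in zip(phi, ones):
            count = n_one if bit else t - n_one
            total += math.log((count + 0.5) / (t + 1.0))
        return total

    def pseudocount(self, phi):
        log_rho = self._log_density(phi, self.ones, self.t)
        # Density of phi after hypothetically observing phi one more time.
        ones_after = [n + bit for n, bit in zip(self.ones, phi)]
        log_rho_after = self._log_density(phi, ones_after, self.t + 1)
        # N = rho * (1 - rho') / (rho' - rho), rewritten as
        # (1 - rho') / (rho' / rho - 1) so tiny densities do not give 0 / 0.
        return (1.0 - math.exp(log_rho_after)) / math.expm1(log_rho_after - log_rho)

    def update(self, phi):
        for i, bit in enumerate(phi):
            self.ones[i] += bit
        self.t += 1

    def bonus(self, phi, beta=0.05):
        # Assumed optimistic bonus: larger for feature vectors with small pseudocounts.
        return beta / math.sqrt(self.pseudocount(phi) + 0.01)


# Usage: a feature vector seen ten times earns a smaller bonus than an unseen one.
counter = FeaturePseudocount(num_features=3)
familiar, novel = [1, 0, 1], [0, 1, 0]
for _ in range(10):
    counter.update(familiar)
print(counter.pseudocount(familiar), counter.pseudocount(novel))
print(counter.bonus(familiar), counter.bonus(novel))
```

Because the density is a product over features, a state never visited before still receives a sizeable pseudocount when its individual features are familiar, which is the kind of generalization about uncertainty that the abstract refers to. In an agent, the bonus would typically be added to the environment reward before the value update.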
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Data Stream Mining Techniques
