Provably Efficient Maximum Entropy Exploration
Elad Hazan, Sham M. Kakade, Karan Singh, Abby Van Soest

TL;DR
This paper introduces an efficient algorithm for maximizing entropy-based objectives in unknown MDPs without rewards, leveraging a black box planner and the Frank-Wolfe method, with proven efficiency in tabular settings.
Contribution
It develops a provably efficient algorithm for intrinsic, entropy-based exploration objectives in MDPs, using a black box planner and the Frank-Wolfe method.
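The idea of pairing a black box planner with Frank-Wolfe can be sketched for a small, fully known tabular MDP. This is an illustrative sketch under stated assumptions, not the authors' implementation: value iteration stands in for the planning oracle, the objective is taken to be a smoothed entropy of the form -sum_s d(s) log(d(s) + sigma), the step size is the textbook Frank-Wolfe schedule 2/(t+2), and all function names are hypothetical.

```python
import numpy as np

def state_visitation(P, pi, mu0, gamma):
    """Discounted state-visitation distribution of policy pi (shape S x A)."""
    S = P.shape[0]
    P_pi = np.einsum("sa,sat->st", pi, P)  # state-to-state kernel under pi
    # d^T = (1 - gamma) * mu0^T (I - gamma * P_pi)^{-1}
    return (1 - gamma) * np.linalg.solve((np.eye(S) - gamma * P_pi).T, mu0)

def plan(P, r, gamma, iters=500):
    """Stand-in planning oracle (assumption): value iteration on state reward r."""
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = r[:, None] + gamma * (P @ V)  # shape (S, A)
        V = Q.max(axis=1)
    pi = np.zeros((S, A))
    pi[np.arange(S), Q.argmax(axis=1)] = 1.0  # greedy deterministic policy
    return pi

def smoothed_entropy(d, sigma):
    return float(-np.sum(d * np.log(d + sigma)))

def entropy_gradient(d, sigma):
    # Gradient of -sum_s d(s) * log(d(s) + sigma) with respect to d
    return -(np.log(d + sigma) + d / (d + sigma))

def max_entropy_mixture(P, mu0, gamma=0.9, sigma=1e-3, T=100):
    """Frank-Wolfe over state-visitation distributions of a policy mixture."""
    S, A, _ = P.shape
    policies = [np.full((S, A), 1.0 / A)]  # start from the uniform policy
    weights = [1.0]
    d_mix = state_visitation(P, policies[0], mu0, gamma)
    for t in range(1, T + 1):
        r = entropy_gradient(d_mix, sigma)  # gradient acts as the reward
        pi_new = plan(P, r, gamma)          # linear maximization step
        d_new = state_visitation(P, pi_new, mu0, gamma)
        eta = 2.0 / (t + 2)                 # assumed step-size schedule
        weights = [w * (1 - eta) for w in weights] + [eta]
        policies.append(pi_new)
        d_mix = (1 - eta) * d_mix + eta * d_new
    return policies, weights, d_mix
```

The result is a mixture: a list of policies with weights, whose combined state-visitation distribution is the convex combination tracked in `d_mix`. Each iteration only needs the planner to solve an ordinary reward-maximization problem, which is what makes the oracle a black box.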
Findings
The algorithm is efficient in both sample complexity and computational complexity in tabular MDPs.
Provides a black box planning approach for intrinsic exploration objectives.
Proves theoretical guarantees for the proposed method.
Abstract
Suppose an agent is in a (possibly unknown) Markov Decision Process in the absence of a reward signal. What might we hope that an agent can efficiently learn to do? This work studies a broad class of objectives that are defined solely as functions of the state-visitation frequencies that are induced by how the agent behaves. For example, one natural, intrinsically defined objective is for the agent to learn a policy which induces a distribution over the state space that is as uniform as possible, which can be measured in an entropic sense. We provide an efficient algorithm to optimize such intrinsically defined objectives, when given access to a black box planning oracle (which is robust to function approximation). Furthermore, when restricted to the tabular setting where we have sample-based access to the MDP, our proposed algorithm is provably efficient, both in terms of its…
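To make "as uniform as possible ... in an entropic sense" concrete, here is a minimal sketch that scores a state distribution by its Shannon entropy. The helper name and the two example distributions are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def entropy(d):
    """Shannon entropy (in nats) of a distribution over states."""
    d = np.asarray(d, dtype=float)
    nz = d[d > 0]  # treat 0 * log(0) as 0
    return float(-np.sum(nz * np.log(nz)))

uniform = np.full(4, 0.25)                   # visits all four states equally
peaked = np.array([0.97, 0.01, 0.01, 0.01])  # mostly stuck in one state

print(entropy(uniform), entropy(peaked))
```

Over four states the uniform distribution attains the maximum value log(4), while the peaked one scores much lower, so a policy that raises this quantity spreads its visits more evenly across states.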
Taxonomy
Topics: Image and Signal Denoising Methods · Sparse and Compressive Sensing Techniques · CCD and CMOS Imaging Sensors
