The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough
Riccardo Zamboni, Duilio Cirino, Marcello Restelli, Mirco Mutti

TL;DR
This paper studies when maximizing the entropy of observations is a sound substitute for maximizing state entropy in partially observable Markov decision processes (POMDPs). It gives lower and upper bounds on the approximation gap and a regularization approach that exploits knowledge of the observation function.
Contribution
The paper introduces a theoretical framework for approximating the true state entropy with the observation entropy, and proposes a regularization method that leverages properties of the observation function.
Findings
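Derived lower and upper bounds on how well the observation entropy approximates the true state entropy, depending only on properties of the observation function
Proposed a regularization technique that uses knowledge of the observation function to improve exploration in POMDPs
Provided theoretical insights into the intrinsic limits of observation-based entropy maximization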
Abstract
The problem of pure exploration in Markov decision processes has been cast as maximizing the entropy over the state distribution induced by the agent's policy, an objective that has been extensively studied. However, little attention has been dedicated to state entropy maximization under partial observability, despite the latter being ubiquitous in applications, e.g., finance and robotics, in which the agent only receives noisy observations of the true state governing the system's dynamics. How can we address state entropy maximization in those domains? In this paper, we study the simple approach of maximizing the entropy over observations in place of true latent states. First, we provide lower and upper bounds to the approximation of the true state entropy that only depends on some properties of the observation function. Then, we show how knowledge of the latter can be exploited to…
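As a toy illustration of the gap between observation entropy and state entropy that the abstract refers to, the sketch below pushes a fixed state distribution through three observation functions and compares the two entropies. The state distribution, the observation matrices, and all function names are invented for illustration and are not taken from the paper; the code does not reproduce the paper's bounds.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(x * math.log(x) for x in p if x > 0)

def observation_distribution(d_states, obs_fn):
    """Push a state distribution through an observation function.

    obs_fn[s][o] is the probability of emitting observation o in state s.
    """
    n_obs = len(obs_fn[0])
    return [
        sum(d_states[s] * obs_fn[s][o] for s in range(len(d_states)))
        for o in range(n_obs)
    ]

# Hypothetical state distribution induced by some policy (assumption).
d_states = [0.7, 0.2, 0.1]

# Hypothetical observation functions; each row sums to 1 (assumptions).
obs_fns = {
    # Fully observable: each state emits its own observation.
    "identity": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    # Aliased: states 0 and 1 emit the same observation.
    "aliased": [[1, 0], [1, 0], [0, 1]],
    # Noisy: the correct observation is emitted with probability 0.8.
    "noisy": [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]],
}

state_entropy = entropy(d_states)

# Gap = observation entropy minus state entropy, per observation function.
gaps = {}
for name, obs_fn in obs_fns.items():
    d_obs = observation_distribution(d_states, obs_fn)
    gaps[name] = entropy(d_obs) - state_entropy
    print(f"{name}: H(obs) - H(state) = {gaps[name]:+.4f} nats")
```

In this example the gap is zero under the identity observation function, negative when two states are aliased, and positive under symmetric noise. The observation entropy can therefore err in either direction, and the size of the error depends on the observation function.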
Taxonomy
Topics: Semiconductor Lasers and Optical Devices · Analytical Chemistry and Sensors · Optical Network Technologies
Methods: Softmax · Attention Is All You Need
