Smoother Entropy for Active State Trajectory Estimation and Obfuscation in POMDPs
Timothy L. Molloy, Girish N. Nair

TL;DR
This paper controls POMDPs by optimizing the smoother entropy, the conditional entropy of the state trajectory given measurements and controls, and shows that this objective is tractable for both active state trajectory estimation and obfuscation.
Contribution
It develops new expressions for the smoother entropy in terms of the POMDP belief state, and uses them to reformulate both estimation and obfuscation as belief-state MDPs with concave cost and value functions.
Findings
Minimizing the smoother entropy aids estimation of the state trajectory.
Maximizing the smoother entropy hinders estimation, obfuscating the state trajectory.
Reformulation enables use of standard POMDP algorithms.
Abstract
We study the problem of controlling a partially observed Markov decision process (POMDP) to either aid or hinder the estimation of its state trajectory. We encode the estimation objectives via the smoother entropy, which is the conditional entropy of the state trajectory given measurements and controls. Consideration of the smoother entropy contrasts with previous approaches that instead resort to marginal (or instantaneous) state entropies due to tractability concerns. By establishing novel expressions for the smoother entropy in terms of the POMDP belief state, we show that both the problems of minimising and maximising the smoother entropy in POMDPs can surprisingly be reformulated as belief-state Markov decision processes with concave cost and value functions. The significance of these reformulations is that they render the smoother entropy a tractable optimisation objective, with…
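To make the definition in the abstract concrete, the following is a minimal sketch that computes the smoother entropy of a tiny POMDP by brute-force enumeration. It is not the paper's belief-state method. The function name `smoother_entropy`, the two-state model, and its probabilities are hypothetical, and the sketch assumes a fixed open-loop control sequence (so conditioning on the controls is trivial) and measurements that depend only on the current state.

```python
import itertools
import math


def smoother_entropy(prior, trans, obs, controls):
    """Return H(X_0..X_T | Y_1..Y_T) in nats for a fixed control sequence.

    Assumed (hypothetical) model layout:
      prior[x]         = P(X_0 = x)
      trans[u][x][x2]  = P(X_{k+1} = x2 | X_k = x, U_k = u)
      obs[x][y]        = P(Y_k = y | X_k = x)
    """
    n_states = len(prior)
    n_obs = len(obs[0])
    horizon = len(controls)

    joint = {}     # p(x_{0:T}, y_{1:T})
    marginal = {}  # p(y_{1:T})
    for xs in itertools.product(range(n_states), repeat=horizon + 1):
        # Probability of the state trajectory under the given controls.
        p_x = prior[xs[0]]
        for k, u in enumerate(controls):
            p_x *= trans[u][xs[k]][xs[k + 1]]
        if p_x == 0.0:
            continue
        for ys in itertools.product(range(n_obs), repeat=horizon):
            # Measurement y_{k+1} is generated by state x_{k+1}.
            p = p_x
            for k in range(horizon):
                p *= obs[xs[k + 1]][ys[k]]
            if p > 0.0:
                joint[(xs, ys)] = p
                marginal[ys] = marginal.get(ys, 0.0) + p

    # Chain rule: H(X | Y) = H(X, Y) - H(Y).
    h_joint = -sum(p * math.log(p) for p in joint.values())
    h_meas = -sum(p * math.log(p) for p in marginal.values())
    return h_joint - h_meas


# Hypothetical two-state example: control 0 keeps the state "sticky",
# control 1 re-draws the state uniformly at every step.
prior = [0.5, 0.5]
trans = [
    [[0.9, 0.1], [0.1, 0.9]],  # u = 0
    [[0.5, 0.5], [0.5, 0.5]],  # u = 1
]
obs = [[0.8, 0.2], [0.2, 0.8]]

h_sticky = smoother_entropy(prior, trans, obs, [0, 0, 0])
h_mixing = smoother_entropy(prior, trans, obs, [1, 1, 1])
print(f"sticky controls: {h_sticky:.3f} nats, mixing controls: {h_mixing:.3f} nats")
```

In this toy model the sticky control sequence gives the lower smoother entropy, so it would be preferred when the goal is estimation, and the mixing sequence when the goal is obfuscation. The enumeration grows exponentially with the horizon, which is the kind of cost a belief-state reformulation is meant to avoid.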
Taxonomy
Topics: Fault Detection and Control Systems · Bayesian Modeling and Causal Inference · Adversarial Robustness in Machine Learning
