Least Inferable Policies for Markov Decision Processes
Mustafa O. Karabag, Melkior Ornik, Ufuk Topcu

TL;DR
This paper develops a convex optimization approach to synthesize policies in Markov decision processes that minimize information leakage to an observer, enhancing privacy while ensuring reachability.
Contribution
It introduces a Fisher information-based metric for policy synthesis that reduces an observer's ability to infer transition probabilities in MDPs.
Findings
The expected total information is inversely proportional to the observer's estimation error.
The method effectively minimizes information leakage while satisfying reachability constraints.
Analysis confirms the metric's effectiveness in quantifying information leakage.
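The inverse relationship between information and estimation error noted in the findings can be illustrated with a textbook case. The sketch below is not the paper's metric for MDP transition probabilities; it assumes an observer estimating a single Bernoulli parameter p from n independent draws, and the function names are hypothetical. In that setting the Fisher information is n / (p(1 - p)) and the variance of the sample-mean estimator is p(1 - p) / n, so their product is exactly 1.

```python
from fractions import Fraction


def bernoulli_fisher_information(p, n):
    """Fisher information about p carried by n independent Bernoulli(p) draws."""
    return n / (p * (1 - p))


def sample_mean_variance(p, n):
    """Variance of the sample-mean estimator of p built from n draws."""
    return p * (1 - p) / n


# Illustrative value only; exact fractions avoid floating-point noise.
p = Fraction(3, 10)
for n in (1, 10, 100):
    info = bernoulli_fisher_information(p, n)
    err = sample_mean_variance(p, n)
    # More information means a smaller estimation error: info * err == 1.
    print(n, info, err, info * err)
```

In this toy case, an agent that leaks less information (fewer informative draws, or a p nearer to 1/2) leaves the observer with a larger estimation variance, which is the intuition behind using Fisher information as a leakage proxy.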
Abstract
In a variety of applications, an agent's success depends on the knowledge that an adversarial observer has or can gather about the agent's decisions. It is therefore desirable for the agent to achieve a task while reducing the ability of an observer to infer the agent's policy. We consider the task of the agent as a reachability problem in a Markov decision process and study the synthesis of policies that minimize the observer's ability to infer the transition probabilities of the agent between the states of the Markov decision process. We introduce a metric based on the Fisher information as a proxy for the information leaked to the observer and, using this metric, formulate a problem that minimizes the expected total information subject to the reachability constraint. We proceed to solve the problem using convex optimization methods. To verify the proposed method, we analyze the…
