Policy Gradient Methods for Information-Theoretic Opacity in Markov Decision Processes
Chongyang Shi, Sumukha Udupa, Michael R. Dorothy, Shuo Han, Jie Fu

TL;DR
This paper introduces an information-theoretic measure of opacity in Markov decision processes and develops algorithms to optimize control policies that maximize opacity while ensuring task performance.
Contribution
The paper proposes a new measure of opacity based on conditional entropy and presents algorithms, with convergence proofs, for computing maximally opaque policies in MDPs.
Findings
Finite-memory policies can outperform Markov policies in opacity optimization.
The primal-dual gradient algorithm effectively computes maximally opaque policies.
Experimental results validate the effectiveness of the proposed methods.
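To make the primal-dual idea concrete, here is a minimal sketch of a generic primal-dual gradient iteration on a toy constrained problem. It is not the paper's algorithm: the scalar `x` merely stands in for policy parameters, `f` for the opacity objective, and `g(x) >= 0` for a task performance constraint. The function name `primal_dual`, the step size, and the toy functions are all assumptions made for illustration.

```python
def primal_dual(grad_f, g, grad_g, x0, step=0.05, iters=2000):
    """Maximize f(x) subject to g(x) >= 0.

    Gradient ascent on x and projected gradient descent on the
    multiplier lam for the Lagrangian L(x, lam) = f(x) + lam * g(x).
    """
    x, lam = x0, 0.0
    for _ in range(iters):
        # Ascent direction for the primal variable.
        grad_x = grad_f(x) + lam * grad_g(x)
        # Descent step for the multiplier, projected onto lam >= 0.
        new_lam = max(0.0, lam - step * g(x))
        x = x + step * grad_x
        lam = new_lam
    return x, lam


# Hypothetical toy problem: maximize -(x - 2)^2 subject to 1 - x >= 0.
x_opt, lam_opt = primal_dual(
    grad_f=lambda x: -2.0 * (x - 2.0),
    g=lambda x: 1.0 - x,
    grad_g=lambda x: -1.0,
    x0=0.0,
)
```

On this toy problem the unconstrained maximizer `x = 2` violates the constraint, so the iterates should settle at the boundary `x = 1` with multiplier `2`, the point where the objective gradient and the weighted constraint gradient cancel.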
Abstract
Opacity, or non-interference, is a property ensuring that an external observer cannot infer confidential information (the "secret") from system observations. We introduce an information-theoretic measure of opacity, which quantifies information leakage using the conditional entropy of the secret given the observer's partial observations in a system modeled as a Markov decision process (MDP). Our objective is to find a control policy that maximizes opacity while satisfying task performance constraints, assuming that an informed observer is aware of the control policy and system dynamics. Specifically, we consider a class of opacity called state-based opacity, where the secret is a propositional formula about the past or current state of the system, and a special case of state-based opacity called language-based opacity, where the secret is defined by a linear temporal logic (LTL) formula or a…
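The conditional-entropy measure described in the abstract can be illustrated with a small sketch. The example below computes H(Z | Y) for a binary secret Z and a single observation symbol Y from a joint probability table. This is a simplification: in the paper the observer sees partial observations of the MDP over time, whereas here the observation is collapsed to one symbol. The function name and both toy distributions are hypothetical.

```python
import math


def conditional_entropy(joint):
    """Return H(Z | Y) in bits for a joint distribution {(z, y): probability}."""
    # Marginal distribution of the observation Y.
    p_y = {}
    for (z, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # H(Z | Y) = -sum over (z, y) of p(z, y) * log2(p(z, y) / p(y)).
    # Zero-probability entries contribute nothing and are skipped.
    h = 0.0
    for (z, y), p in joint.items():
        if p > 0.0:
            h -= p * math.log2(p / p_y[y])
    return h


# Hypothetical toy cases: z = 1 if the secret holds, y is what the observer sees.
leaky = {(0, "a"): 0.5, (1, "b"): 0.5}  # each observation pins down the secret
opaque = {(0, "a"): 0.25, (1, "a"): 0.25,
          (0, "b"): 0.25, (1, "b"): 0.25}  # observation is independent of the secret

h_leaky = conditional_entropy(leaky)
h_opaque = conditional_entropy(opaque)
```

In the first case the observation determines the secret, so the conditional entropy is 0 bits. In the second the observation carries no information about a uniformly distributed binary secret, so the conditional entropy reaches its maximum of 1 bit. A higher value means the observer remains more uncertain about the secret, which is the sense in which a policy is more opaque.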
Taxonomy
Topics: Smart Grid Security and Resilience · Petri Nets in System Modeling · Reinforcement Learning in Robotics
