Information-Theoretic Opacity-Enforcement in Markov Decision Processes
Chongyang Shi, Yuheng Bu, Jie Fu

TL;DR
This paper introduces information-theoretic methods to enforce opacity in Markov decision processes, balancing privacy of secrets with system performance using novel policy gradient algorithms.
Contribution
It develops primal-dual policy gradient algorithms for opacity enforcement in MDPs, leveraging message passing for efficient entropy gradient computation.
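To make the primal-dual idea concrete, the sketch below runs gradient ascent on a Lagrangian for a toy scalar problem. It is not the paper's algorithm: the objective `-(theta - 2)^2` merely stands in for the opacity measure, `-theta^2` stands in for the total return, and the threshold, step size, and function name are all assumptions chosen for illustration.

```python
def primal_dual(steps=5000, lr=0.01, threshold=-1.0):
    """Toy primal-dual ascent (hypothetical stand-in, not the paper's method).

    Maximize  f(theta) = -(theta - 2)^2      (stand-in for opacity)
    subject to g(theta) = -theta^2 >= threshold  (stand-in for return).
    """
    theta, lam = 0.0, 0.0
    for _ in range(steps):
        grad_obj = -2.0 * (theta - 2.0)   # d f / d theta
        grad_ret = -2.0 * theta           # d g / d theta
        # Primal step: ascend the Lagrangian f + lam * (g - threshold).
        theta += lr * (grad_obj + lam * grad_ret)
        # Dual step: raise lam when the constraint is violated,
        # lower it otherwise, and project back onto lam >= 0.
        slack = -theta ** 2 - threshold
        lam = max(0.0, lam - lr * slack)
    return theta, lam


theta, lam = primal_dual()
print(f"theta={theta:.4f}, lambda={lam:.4f}")
```

In this toy problem the unconstrained maximizer `theta = 2` violates the constraint, so the multiplier grows until the iterate settles on the constraint boundary near `theta = 1`. In the paper's setting the scalar `theta` would be replaced by policy parameters and the gradients by policy-gradient estimates.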
Findings
The proposed algorithms converge stably and quickly.
The method enforces opacity effectively in a grid-world example.
The resulting policies balance privacy of the secret against a constraint on system return.
Abstract
The paper studies information-theoretic opacity, an information-flow privacy property, in a setting involving two agents: a planning agent who controls a stochastic system and an observer who partially observes the system states. The goal of the observer is to infer some secret, represented by a random variable, from its partial observations, while the goal of the planning agent is to make the secret maximally opaque to the observer while achieving a satisfactory total return. Modeling the stochastic system as a Markov decision process, the paper considers two classes of opacity properties: last-state opacity, which ensures that the observer is uncertain whether the last state is in a specific set, and initial-state opacity, which ensures that the observer is unsure of the realization of the initial state. As the measure of opacity, we employ the Shannon conditional entropy capturing the…
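As a small illustration of the opacity measure named above, the sketch below computes the Shannon conditional entropy H(Z | Y) of a secret Z given an observation Y from a joint probability table. The function name, the dictionary layout, and the two example distributions are assumptions made for this sketch; the paper computes the quantity over observation sequences generated by an MDP policy, which is not reproduced here.

```python
import math


def conditional_entropy(joint):
    """Return H(Z | Y) in bits.

    `joint` maps (z, y) pairs to P(Z = z, Y = y); this layout is a
    hypothetical convenience, not taken from the paper.
    """
    # Marginal distribution of the observation Y.
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # H(Z | Y) = -sum_{z,y} P(z, y) * log2 P(z | y), skipping zero terms.
    h = 0.0
    for (_, y), p in joint.items():
        if p > 0.0:
            h -= p * math.log2(p / p_y[y])
    return h


# Observation reveals the secret exactly: no uncertainty remains.
revealing = {(0, "a"): 0.5, (1, "b"): 0.5}
# Observation is independent of a fair binary secret: one full bit remains.
opaque = {(0, "a"): 0.25, (1, "a"): 0.25, (0, "b"): 0.25, (1, "b"): 0.25}

print(conditional_entropy(revealing), conditional_entropy(opaque))
```

A higher value means the observer learns less about the secret from what it sees, which is why the planning agent seeks to maximize this quantity.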
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Reinforcement Learning in Robotics · Advanced Research in Systems and Signal Processing
Methods: Sparse Evolutionary Training
