Policy Gradient Methods for Information-Theoretic Opacity in Markov Decision Processes

Chongyang Shi; Sumukha Udupa; Michael R. Dorothy; Shuo Han; Jie Fu

arXiv:2511.02704 · eess.SY · November 5, 2025

TL;DR

This paper introduces an information-theoretic measure of opacity in Markov decision processes and develops algorithms to optimize control policies that maximize opacity while ensuring task performance.

Contribution

It proposes a new measure of opacity based on conditional entropy and presents algorithms for computing maximally opaque policies in MDPs, including convergence proofs.

Findings

1. Finite-memory policies can outperform Markov policies in opacity optimization.

2. The primal-dual gradient algorithm effectively computes maximally opaque policies.

3. Experimental results validate the effectiveness of the proposed methods.
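
As a rough illustration of the primal-dual gradient idea named in the findings, the sketch below runs the generic iteration on a hypothetical scalar problem: maximize -(x - 2)^2 subject to 1 - x >= 0. The objective, constraint, step sizes, and the function name `primal_dual` are all assumptions chosen for illustration; this is not the paper's algorithm, which optimizes policy parameters of an MDP under a task performance constraint.

```python
def primal_dual(steps=2000, alpha=0.05, beta=0.05):
    """Toy primal-dual gradient iteration for: max -(x - 2)^2  s.t.  1 - x >= 0.

    Hypothetical example; the problem and step sizes are illustrative assumptions.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        g = 1.0 - x                      # constraint value, feasible when g >= 0
        grad_x = -2.0 * (x - 2.0) - lam  # d/dx of L(x, lam) = -(x - 2)^2 + lam * (1 - x)
        x_next = x + alpha * grad_x      # primal step: gradient ascent on the Lagrangian
        lam = max(0.0, lam - beta * g)   # dual step: gradient descent, projected onto lam >= 0
        x = x_next
    return x, lam


x_star, lam_star = primal_dual()
```

Under these assumptions the iterates should settle near the constrained optimum x = 1, with the multiplier near 2, the value at which the Lagrangian's gradient in x vanishes on the constraint boundary.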

Abstract

Opacity, or non-interference, is a property ensuring that an external observer cannot infer confidential information (the "secret") from system observations. We introduce an information-theoretic measure of opacity, which quantifies information leakage using the conditional entropy of the secret given the observer's partial observations in a system modeled as a Markov decision process (MDP). Our objective is to find a control policy that maximizes opacity while satisfying task performance constraints, assuming that an informed observer is aware of the control policy and system dynamics. Specifically, we consider a class of opacity called state-based opacity, where the secret is a propositional formula about the past or current state of the system, and a special case of state-based opacity called language-based opacity, where the secret is defined by a linear temporal logic (LTL) formula or a…
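
To make the conditional-entropy measure in the abstract concrete, here is a minimal sketch that computes H(Z | Y) in bits for a secret Z and an observation Y from an explicit joint distribution. The function name `conditional_entropy` and the two toy distributions are hypothetical; the paper works with observation sequences generated by an MDP under a policy, which this static example does not model.

```python
import math
from collections import defaultdict


def conditional_entropy(joint):
    """Return H(Z | Y) in bits for a joint distribution {(z, y): probability}."""
    # Marginal over observations: p(y) = sum_z p(z, y)
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    # H(Z | Y) = -sum_{z, y} p(z, y) * log2(p(z, y) / p(y)); zero-probability terms are skipped
    h = 0.0
    for (_, y), p in joint.items():
        if p > 0.0:
            h -= p * math.log2(p / p_y[y])
    return h


# Hypothetical toy cases: secret z in {0, 1}, observation y in {"a", "b"}
opaque = {(0, "a"): 0.25, (0, "b"): 0.25, (1, "a"): 0.25, (1, "b"): 0.25}
leaky = {(0, "a"): 0.5, (1, "b"): 0.5}

h_opaque = conditional_entropy(opaque)  # observation independent of the secret
h_leaky = conditional_entropy(leaky)    # observation determines the secret
```

In the first case the observation carries no information about the secret, so the full one bit of uncertainty remains; in the second the observation reveals the secret and the conditional entropy drops to zero. Higher values therefore correspond to greater opacity.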

Taxonomy

Topics: Smart Grid Security and Resilience · Petri Nets in System Modeling · Reinforcement Learning in Robotics