How Private Is Your RL Policy? An Inverse RL Based Analysis Framework
Kritika Prakash, Fiza Husain, Praveen Paruchuri, Sujit P. Gujar

TL;DR
This paper introduces a framework for analyzing the privacy of RL policies by attempting to reconstruct original rewards, revealing potential privacy vulnerabilities in existing differentially-private RL algorithms across various domains.
Contribution
The paper proposes a novel Privacy-Aware Inverse RL framework that evaluates the privacy guarantees of RL policies through reward reconstruction attacks.
Findings
Current private RL algorithms show gaps in privacy protection.
Reward reconstruction can partially recover original rewards from private policies.
Analysis highlights the need for stronger privacy standards in RL.
Abstract
Reinforcement Learning (RL) enables agents to learn how to perform various tasks from scratch. In domains such as autonomous driving and recommendation systems, learned optimal RL policies could cause a privacy breach if they memorize any part of the private reward. We study the set of existing differentially-private RL policies derived from various RL algorithms such as Value Iteration, Deep Q Networks, and Vanilla Proximal Policy Optimization. We propose a new Privacy-Aware Inverse RL (PRIL) analysis framework that performs reward reconstruction as an adversarial attack on private policies that the agents may deploy. For this, we introduce the reward reconstruction attack, wherein we seek to reconstruct the original reward from a privacy-preserving policy using an Inverse RL algorithm. An adversary must do poorly at reconstructing the original reward function if the agent…
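The attack pipeline described in the abstract (train a private policy, reconstruct a reward from it, compare with the original reward) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 5-state chain MDP, the use of Laplace noise on the reward as a stand-in for a differentially-private RL algorithm (with scale 1/epsilon, assuming sensitivity 1 and making no formal guarantee), the state-visitation estimate as a deliberately crude stand-in for an Inverse RL algorithm, and all function names.

```python
import random

def step(s, a, n):
    """Chain MDP dynamics: action 0 moves left, action 1 moves right, walls at both ends."""
    return max(0, s - 1) if a == 0 else min(n - 1, s + 1)

def greedy_policy(reward, gamma=0.9, iters=200):
    """Value iteration; reward[s2] is received on entering state s2."""
    n = len(reward)
    q = lambda s, a, v: reward[step(s, a, n)] + gamma * v[step(s, a, n)]
    v = [0.0] * n
    for _ in range(iters):
        v = [max(q(s, a, v) for a in (0, 1)) for s in range(n)]
    return [max((0, 1), key=lambda a: q(s, a, v)) for s in range(n)]

def laplace(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_policy(reward, epsilon, rng):
    """ASSUMPTION: perturb the reward with Laplace noise, then plan on the noisy reward."""
    noisy = [r + laplace(1.0 / epsilon, rng) for r in reward]
    return greedy_policy(noisy)

def reconstruct_reward(policy, horizon=20):
    """ASSUMPTION: crude IRL stand-in; states the policy visits often are guessed to be rewarding."""
    n = len(policy)
    counts = [0] * n
    for start in range(n):
        s = start
        for _ in range(horizon):
            s = step(s, policy[s], n)
            counts[s] += 1
    total = sum(counts)
    return [c / total for c in counts]

def l1_distance(a, b):
    """Distance between original and reconstructed reward (both sum to 1 here)."""
    return sum(abs(x - y) for x, y in zip(a, b))

true_reward = [0.0, 0.0, 0.0, 0.0, 1.0]  # hypothetical private reward: goal at the right end
rng = random.Random(0)
baseline = l1_distance(true_reward, reconstruct_reward(greedy_policy(true_reward)))
for epsilon in (10.0, 0.1):
    policy = private_policy(true_reward, epsilon, rng)
    d = l1_distance(true_reward, reconstruct_reward(policy))
    print(f"epsilon={epsilon}: reconstruction distance {d:.2f} (non-private: {baseline:.2f})")
```

In this sketch a larger distance means the adversary reconstructs the reward less accurately, which is the behavior the abstract expects from a policy that actually protects the reward. The outcome for any single noise draw depends on the seed, so a meaningful comparison would average over many draws.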
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning
