Differentially Private Reinforcement Learning with Linear Function Approximation

Xingyu Zhou

arXiv:2201.07052·cs.LG·March 22, 2022

TL;DR

This paper introduces privacy-preserving reinforcement learning algorithms for large-scale MDPs with linear function approximation, achieving sub-linear regret while protecting user data under joint differential privacy.

Contribution

The paper develops the first private RL algorithms for MDPs with large state and action spaces under linear function approximation, with privacy guarantees and regret bounds that do not depend on the number of states.

Findings

01 Algorithms achieve sub-linear regret with privacy guarantees.

02 Regret bounds scale logarithmically with the number of actions.

03 Methods are suitable for large-scale personalized services.

Abstract

Motivated by the wide adoption of reinforcement learning (RL) in real-world personalized services, where users' sensitive and private information needs to be protected, we study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP). Compared to existing private RL algorithms that work only on tabular finite-state, finite-action MDPs, we take the first step towards privacy-preserving learning in MDPs with large state and action spaces. Specifically, we consider MDPs with linear function approximation (in particular, linear mixture MDPs) under the notion of joint differential privacy (JDP), where the RL agent is responsible for protecting users' sensitive data. We design two private RL algorithms that are based on value iteration and policy optimization, respectively, and show that they enjoy sub-linear regret performance…
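For readers unfamiliar with the terms in the abstract, the three central objects can be written out as below. This is a sketch in the standard textbook form, not a quotation from the paper: the symbols (H for the horizon, K for the number of episodes, d for the feature dimension, \phi for the known feature map, \theta_h for the unknown parameter, \mathcal{M} for the learning algorithm viewed as a mechanism) are notation assumed here for illustration, and the paper's own definitions may differ in detail.

```latex
% Linear mixture MDP (assumed notation): at each step h = 1, ..., H the
% transition kernel is linear in a known feature map with an unknown parameter.
\[
  P_h(s' \mid s, a) = \langle \phi(s' \mid s, a), \theta_h \rangle,
  \qquad \theta_h \in \mathbb{R}^d .
\]

% Regret over K episodes: total gap between the optimal value and the value
% of the policy \pi_k played in episode k, from the initial state s_1^k.
\[
  \mathrm{Regret}(K) = \sum_{k=1}^{K}
  \left[ V_1^{*}(s_1^k) - V_1^{\pi_k}(s_1^k) \right].
\]

% (\varepsilon, \delta)-joint differential privacy: for every episode t, every
% pair of user sequences U, U' that differ only in the t-th user, and every
% set E of action recommendations sent to the other K - 1 users,
\[
  \Pr\left[ \mathcal{M}_{-t}(U) \in E \right]
  \le e^{\varepsilon} \Pr\left[ \mathcal{M}_{-t}(U') \in E \right] + \delta .
\]
```

Read this way, "sub-linear regret" means Regret(K)/K tends to zero as K grows, and JDP asks that swapping one user's data changes the distribution of what every other user sees by only a small, bounded amount; the output sent to user t itself is excluded from the guarantee.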

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Age of Information Optimization · Vehicular Ad Hoc Networks (VANETs)