Privacy-preserving Q-Learning with Functional Noise in Continuous State Spaces
Baoxiang Wang, Nidhi Hegde

TL;DR
This paper introduces a differentially private Q-learning algorithm for continuous state spaces. It adds functional noise to the value function to protect reward information, so the noise level needed for privacy does not grow with the size of the state space.
Contribution
It proposes a novel method of adding functional noise to the value function in continuous spaces, providing rigorous privacy guarantees independent of the number of queried states.
Findings
The algorithm achieves approximate optimality in discrete state spaces.
Experiments corroborate the theoretical findings and show improvement over existing approaches.
Theoretical analysis confirms privacy guarantees and utility bounds.
Abstract
We consider differentially private algorithms for reinforcement learning in continuous spaces, such that neighboring reward functions are indistinguishable. This protects the reward information from being exploited by methods such as inverse reinforcement learning. Existing studies that guarantee differential privacy are not extendable to infinite state spaces, as the noise level to ensure privacy will scale accordingly to infinity. Our aim is to protect the value function approximator, without regard to the number of states queried to the function. It is achieved by adding functional noise to the value function iteratively in the training. We show rigorous privacy guarantees by a series of analyses on the kernel of the noise space, the probabilistic bound of such noise samples, and the composition over the iterations. We gain insight into the utility analysis by proving the algorithm's…
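The idea of functional noise can be illustrated with a minimal sketch: rather than drawing independent noise for every queried state, one noise function is sampled and then added to the value function wherever it is evaluated. The code below lazily samples a path of a zero-mean Gaussian process on a one-dimensional state space. The exponential kernel, the `sigma` and `length_scale` parameters, the class name `GaussianProcessNoise`, and the stand-in `value_estimate` are all assumptions made for illustration. They are not taken from the paper, and this sketch alone carries no privacy guarantee, since the calibration of the noise to a privacy budget is not shown in this summary.

```python
import numpy as np


class GaussianProcessNoise:
    """Lazily sampled path of a zero-mean Gaussian process on a 1-D state space.

    Hypothetical helper: kernel and parameters are illustrative assumptions.
    """

    def __init__(self, sigma=1.0, length_scale=0.2, seed=0):
        self.sigma = sigma
        self.length_scale = length_scale
        self.rng = np.random.default_rng(seed)
        self.states = []  # states queried so far
        self.values = []  # noise values already drawn at those states

    def kernel(self, x, y):
        # Exponential kernel k(x, y) = sigma^2 * exp(-|x - y| / length_scale).
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        dist = np.abs(x[:, None] - y[None, :])
        return self.sigma ** 2 * np.exp(-dist / self.length_scale)

    def __call__(self, s):
        s = float(s)
        # A repeated query must return the same value: the noise is a function.
        if s in self.states:
            return self.values[self.states.index(s)]
        if not self.states:
            mean, var = 0.0, self.sigma ** 2
        else:
            # Condition the new sample on all previously drawn values.
            K = self.kernel(self.states, self.states)
            k = self.kernel(self.states, [s])[:, 0]
            w = np.linalg.solve(K + 1e-10 * np.eye(len(self.states)), k)
            mean = float(w @ np.asarray(self.values))
            var = max(self.sigma ** 2 - float(w @ k), 0.0)
        g = float(self.rng.normal(mean, np.sqrt(var)))
        self.states.append(s)
        self.values.append(g)
        return g


def value_estimate(s):
    # Hypothetical stand-in for a learned value function on [0, 1].
    return 1.0 - (s - 0.5) ** 2


noise = GaussianProcessNoise(sigma=0.1)


def noisy_value(s):
    # Released value = learned value + one shared noise function.
    return value_estimate(s) + noise(s)
```

Because each new sample is conditioned on the values already drawn, querying more states only reveals more of the same noise function. This is the property the abstract refers to when it says the protection holds without regard to the number of states queried.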
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Distributed Sensor Networks and Detection Algorithms · Stochastic Gradient Optimization Techniques
