Privacy Preserving Reinforcement Learning with One-Sided Feedback
Lin William Cong, Guangyan Gan, Hanzhang Qin, Zhenzhen Yan

TL;DR
This paper introduces POOL, a privacy-preserving reinforcement learning algorithm designed for complex environments with partial feedback, achieving strong privacy without sacrificing learning efficiency.
Contribution
The paper proposes POOL, a novel RL algorithm that balances privacy preservation with sample efficiency in multi-dimensional continuous state and action spaces with one-sided feedback.
Findings
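POOL's sample complexity bound matches the known lower bounds for non-private RL.
Strong privacy guarantees can be enforced while maintaining high learning efficiency.
POOL targets both learning efficiency and privacy preservation when the agent observes the state only partially and receives reward information for only a subset of the state-action space at each time step.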
Abstract
We study reinforcement learning (RL) in multi-dimensional continuous state and action spaces with one-sided feedback, where the agent receives partial observations of the state and obtains reward information for only a subset of the state-action space at each time step. This setting introduces substantial challenges in both learning efficiency and privacy preservation. To address these challenges, we propose POOL, a novel privacy-preserving RL algorithm. We conduct a comprehensive theoretical analysis of POOL, deriving a sample complexity bound that matches the known lower bounds for non-private RL. Here, E_rho denotes the privacy parameter, H is the time horizon, and alpha is the optimality-gap parameter. Our findings show that it is possible to enforce strong privacy guarantees while maintaining high learning efficiency, marking a significant step toward practical, privacy-aware RL in…
