Relative Entropy Regularized Reinforcement Learning for Efficient Encrypted Policy Synthesis
Jihoon Suh, Yeongjun Jang, Kaoru Teranishi, Takashi Tanaka

TL;DR
This paper introduces a novel method for privacy-preserving reinforcement learning that integrates fully homomorphic encryption with a relative-entropy-regularized framework, enabling efficient encrypted policy synthesis with theoretical guarantees.
Contribution
It presents a new encrypted reinforcement learning framework that simplifies value iteration and provides convergence analysis for privacy-preserving policy development.
Findings
Effective integration of FHE with RL for encrypted policies
Theoretical convergence and error bounds established
Numerical simulations validate the approach
Abstract
We propose an efficient encrypted policy synthesis method for privacy-preserving model-based reinforcement learning. We first show that the relative-entropy-regularized reinforcement learning (RERL) framework gives value iteration a computationally convenient linear and "min-free" structure, which allows fully homomorphic encryption (FHE) with bootstrapping to be integrated directly and efficiently into policy synthesis. We then analyze convergence and derive error bounds that account for how encryption-induced errors, including quantization and bootstrapping errors, propagate through the encrypted policy synthesis. Numerical simulations validate the theoretical analysis and demonstrate the effectiveness of the RERL framework in integrating FHE for encrypted policy synthesis.
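To illustrate what a linear, "min-free" value iteration can look like, the sketch below uses one well-known relative-entropy-regularized setting: a first-exit linearly solvable MDP, where the exponentiated value z(x) = exp(-V(x)) is updated with only multiplications and additions. This is an assumption chosen for illustration and may differ from the paper's exact formulation. The function names, the toy three-state chain, and the rounding step that stands in for quantization error are all hypothetical; no encryption is performed here.

```python
import math


def desirability_iteration(P, q, terminal, iters=200, quantize=None):
    """Iterate z(x) <- exp(-q(x)) * sum_y P[x][y] * z(y) on interior states.

    P        : passive transition probabilities, P[x][y]
    q        : per-state costs
    terminal : set of absorbing goal states, where z(x) = exp(-q(x))
    quantize : optional step size; z is rounded to this grid after every
               sweep as a plaintext stand-in for fixed-point error
               (illustrative only, not real encryption)
    """
    n = len(q)
    z = [1.0] * n
    for _ in range(iters):
        new_z = []
        for x in range(n):
            if x in terminal:
                new_z.append(math.exp(-q[x]))
            else:
                # Linear in z: a weighted sum, with no min or max over actions.
                expected = sum(P[x][y] * z[y] for y in range(n))
                new_z.append(math.exp(-q[x]) * expected)
        if quantize is not None:
            new_z = [round(v / quantize) * quantize for v in new_z]
        z = new_z
    return z


def optimal_policy(P, z, x):
    """Controlled transition distribution at state x: P[x][y] * z[y], normalized."""
    weights = [P[x][y] * z[y] for y in range(len(z))]
    total = sum(weights)
    return [w / total for w in weights]


# Toy chain (hypothetical): states 0 and 1 are interior, state 2 is the goal.
P = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0],
]
q = [1.0, 1.0, 0.0]

z_exact = desirability_iteration(P, q, terminal={2})
z_quant = desirability_iteration(P, q, terminal={2}, quantize=2 ** -16)
policy_at_1 = optimal_policy(P, z_exact, 1)
```

Because each sweep uses only additions and multiplications by known constants, the update is the kind of arithmetic that homomorphic schemes support natively, whereas a min over actions would require a comparison. Comparing `z_exact` with `z_quant` gives a rough plaintext feel for how per-sweep rounding error accumulates across iterations.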
