TL;DR
This paper introduces RLDP, a reinforcement learning framework that adaptively manages privacy-preserving fine-tuning of large language models, improving utility and efficiency while maintaining formal differential privacy guarantees.
Contribution
RLDP is the first method to treat differential privacy optimization as a closed-loop control problem using deep RL, enabling adaptive privacy-utility trade-offs during LLM fine-tuning.
Findings
Achieves 5.6% utility improvement on average.
Speeds up training by 71% compared to baselines.
Maintains privacy guarantees while reducing vulnerability to attacks.
Abstract
The tension between data privacy and model utility has become the defining bottleneck for the practical deployment of large language models (LLMs) trained on sensitive corpora, including healthcare data. Differentially private stochastic gradient descent (DP-SGD) guarantees formal privacy, yet it does so at a pronounced cost: gradients are forcibly clipped and perturbed with noise, degrading sample efficiency and final accuracy. Numerous variants have been proposed to soften this trade-off, but they all share a handicap: their control knobs are hard-coded, global, and oblivious to the evolving optimization landscape. Consequently, practitioners are forced either to over-spend privacy budget in pursuit of utility, or to accept mediocre models in order to stay within privacy constraints. We present RLDP, the first framework to cast DP optimization itself as a closed-loop control problem…
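To make the clip-and-perturb step mentioned in the abstract concrete, here is a minimal sketch of one standard DP-SGD gradient computation in plain Python. It is not RLDP and not the authors' code: the function names, the `clip_norm` and `noise_multiplier` parameters, and the toy gradients are all illustrative assumptions. The two parameters are fixed for the whole call, which is the kind of hard-coded, global knob the abstract criticizes.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Rescale one example's gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / max(norm, 1e-12))
    return [g * scale for g in grad]


def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    clip_norm and noise_multiplier are hypothetical names for the fixed
    knobs of textbook DP-SGD; they are assumptions, not the paper's API.
    """
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation scales with the clipping bound.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / batch_size for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    toy_grads = [[3.0, 4.0], [0.3, 0.4]]  # made-up per-example gradients
    print(dp_sgd_gradient(toy_grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

A larger `clip_norm` preserves more of each gradient but forces proportionally more noise, while a smaller one discards signal. An adaptive controller would presumably adjust such values during training rather than fixing them up front.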
