RL-Finetuned LLMs for Privacy-Preserving Synthetic Rewriting
Zhan Shi, Yefeng Yuan, Yuhong Liu, Liang Cheng, Yi Fang

TL;DR
This paper introduces a reinforcement learning method to fine-tune large language models for generating privacy-preserving synthetic text that balances data utility and privacy protection.
Contribution
It proposes a novel RL framework that jointly optimizes for privacy, semantic fidelity, and diversity using a composite reward function incorporating structural privacy cues.
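As a rough illustration of what a composite reward over these three objectives could look like, here is a minimal sketch. It assumes each objective has already been scored in [0, 1] and combines the scores by a normalized weighted sum. The names `RewardWeights`, `distinct_n`, and `composite_reward`, the equal default weights, and the distinct-n diversity proxy are all hypothetical choices for illustration; the summary does not state the paper's actual reward formula or its structural privacy cues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the summary does not give the paper's values.
    privacy: float = 1.0
    fidelity: float = 1.0
    diversity: float = 1.0


def distinct_n(tokens, n=2):
    """Fraction of n-grams that are unique: a simple stand-in for diversity."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


def composite_reward(privacy_score, fidelity_score, diversity_score,
                     weights=RewardWeights()):
    """Normalized weighted sum of three scores, each assumed to lie in [0, 1]."""
    total = weights.privacy + weights.fidelity + weights.diversity
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = (weights.privacy * privacy_score
                + weights.fidelity * fidelity_score
                + weights.diversity * diversity_score)
    return weighted / total


if __name__ == "__main__":
    rewrite = "the patient reported mild symptoms after the visit".split()
    reward = composite_reward(privacy_score=0.9,
                              fidelity_score=0.6,
                              diversity_score=distinct_n(rewrite))
    print(f"composite reward: {reward:.3f}")
```

In an RL fine-tuning loop, a scalar like this would be computed per generated rewrite and passed to the policy-gradient update; how the individual scores are obtained is left open here.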
Findings
Enhanced author obfuscation and privacy metrics
Maintained semantic quality of generated text
Scalable, model-agnostic privacy-preserving data generation
Abstract
The performance of modern machine learning systems depends on access to large, high-quality datasets, often sourced from user-generated content or proprietary, domain-specific corpora. However, these rich datasets inherently contain sensitive personal information, raising significant concerns about privacy, data security, and compliance with regulatory frameworks. While conventional anonymization techniques can remove explicit identifiers, such removal may degrade performance in downstream machine learning tasks. More importantly, simple anonymization may not be effective against inference attacks that exploit implicit signals such as writing style, topical focus, or demographic cues, highlighting the need for more robust privacy safeguards during model training. To address the challenge of balancing user privacy and data utility, we propose a reinforcement learning…
