RewardDS: Privacy-Preserving Fine-Tuning for Large Language Models via Reward Driven Data Synthesis

Jianwei Wang; Chengming Shi; Junyao Yang; Haoran Li; Qianli Ma; Huiping Zhuang; Cen Chen; Ziqian Zeng

arXiv:2502.18517 · cs.CR · September 3, 2025

TL;DR

RewardDS is a novel framework that enhances privacy-preserving data synthesis for fine-tuning large language models by using reward signals to filter and refine synthetic data, improving quality in sensitive domains.

Contribution

RewardDS introduces reward-guided filtering and self-optimizing refinement to generate high-quality synthetic data with differential privacy guarantees.

Findings

1. Effective noise mitigation in synthetic data
2. Improved fine-tuning performance in sensitive domains
3. Demonstrated success across multiple application areas

Abstract

The success of large language models (LLMs) has led many individuals to fine-tune them for domain-specific tasks by uploading their data. However, in sensitive areas such as healthcare and finance, privacy concerns often arise. One promising solution is to generate synthetic data with Differential Privacy (DP) guarantees to replace the private data. However, such synthetic data contain a significant number of flawed samples, which act as noise. Existing solutions typically rely on naive filtering that compares ROUGE-L scores or embedding similarities, which is ineffective at addressing the noise. To address this issue, we propose RewardDS, a novel privacy-preserving framework that fine-tunes a reward proxy model and uses reward signals to guide the synthetic data generation. RewardDS introduces two key modules, Reward Guided Filtering and Self-Optimizing Refinement, to…
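For orientation, the naive ROUGE-L filtering that the abstract describes as the existing baseline might look roughly like the sketch below. It is not taken from the paper: the abstract does not say what each synthetic sample is compared against, so pairing every sample with a reference text is an assumption, as are the function names, the whitespace tokenization, the F1 variant of ROUGE-L, and the 0.5 threshold.

```python
from typing import List, Tuple


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L as the F1 of LCS-based precision and recall.

    Assumption: lowercase whitespace tokenization, chosen for illustration.
    """
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def filter_by_rouge_l(
    pairs: List[Tuple[str, str]], threshold: float = 0.5
) -> List[Tuple[str, str]]:
    """Keep (synthetic, reference) pairs scoring at or above the threshold.

    Hypothetical helper: the pairing and the threshold are assumptions.
    """
    return [(c, r) for c, r in pairs if rouge_l_f1(c, r) >= threshold]
```

A filter of this kind only measures surface token overlap, which is consistent with the abstract's remark that such filtering is ineffective against noisy synthetic samples.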

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Artificial Intelligence in Healthcare and Education · Topic Modeling