Differentially Private Subspace Fine-Tuning for Large Language Models
Lele Zheng, Xiang Wang, Tao Zhang, Yang Cao, Ke Cheng, Yulong Shen

TL;DR
This paper introduces DP-SFT, a subspace fine-tuning method for large language models that injects differential privacy noise only into a low-dimensional task-specific subspace, improving privacy-utility trade-offs.
Contribution
The paper proposes a novel two-stage subspace fine-tuning approach that reduces noise impact while maintaining formal differential privacy guarantees in large language models.
Findings
DP-SFT improves accuracy under privacy constraints.
DP-SFT accelerates convergence during fine-tuning.
DP-SFT outperforms baseline DP fine-tuning methods.
Abstract
Fine-tuning large language models on downstream tasks is crucial for realizing their cross-domain potential but often relies on sensitive data, raising privacy concerns. Differential privacy (DP) offers rigorous privacy guarantees and has been widely adopted in fine-tuning; however, naively injecting noise across the high-dimensional parameter space creates perturbations with large norms, degrading performance and destabilizing training. To address this issue, we propose DP-SFT, a two-stage subspace fine-tuning method that substantially reduces noise magnitude while preserving formal DP guarantees. Our intuition is that, during fine-tuning, significant parameter updates lie within a low-dimensional, task-specific subspace, while other directions change minimally. Hence, we only inject DP noise into this subspace to protect privacy without perturbing irrelevant parameters. In phase one,…
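The abstract's central claim, that confining DP noise to a low-dimensional subspace yields a much smaller perturbation than noising every parameter, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the random orthonormal `basis`, and the dimensions `d` and `k` are hypothetical choices for illustration. The sketch assumes the subspace is already given and does not cover how DP-SFT identifies it or how the privacy budget is accounted for.

```python
import numpy as np


def dp_noise_full(grad, clip_norm, sigma, rng):
    """Clip the gradient, then add Gaussian noise in all d dimensions."""
    scale = min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    clipped = grad * scale
    noise = rng.normal(0.0, sigma * clip_norm, size=grad.shape)
    return clipped + noise, noise


def dp_noise_subspace(grad, basis, clip_norm, sigma, rng):
    """Project onto the k-dim subspace, clip and add noise there, map back.

    `basis` is assumed to have orthonormal columns (shape d x k), so the
    noise keeps the same norm after mapping back to parameter space.
    """
    coords = basis.T @ grad  # k subspace coordinates
    scale = min(1.0, clip_norm / (np.linalg.norm(coords) + 1e-12))
    clipped = coords * scale
    noise = rng.normal(0.0, sigma * clip_norm, size=coords.shape)
    return basis @ (clipped + noise), noise


rng = np.random.default_rng(0)
d, k = 10_000, 16  # hypothetical parameter and subspace dimensions

# Hypothetical subspace: a random orthonormal basis from a reduced QR.
basis, _ = np.linalg.qr(rng.normal(size=(d, k)))
grad = basis @ rng.normal(size=k)  # toy gradient lying inside the subspace

_, noise_full = dp_noise_full(grad, clip_norm=1.0, sigma=1.0, rng=rng)
_, noise_sub = dp_noise_subspace(grad, basis, clip_norm=1.0, sigma=1.0, rng=rng)

full_norm = np.linalg.norm(noise_full)  # grows like sigma * C * sqrt(d)
sub_norm = np.linalg.norm(noise_sub)    # grows like sigma * C * sqrt(k)
print(f"full-space noise norm: {full_norm:.2f}")
print(f"subspace noise norm:   {sub_norm:.2f}")
```

With the same clipping norm and noise multiplier, the expected noise norm scales with the square root of the number of noised dimensions, so moving from d to k dimensions shrinks it by roughly sqrt(d / k) in this toy setting.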
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Topic Modeling · Adversarial Robustness in Machine Learning
