Differentially Private Subspace Fine-Tuning for Large Language Models

Lele Zheng; Xiang Wang; Tao Zhang; Yang Cao; Ke Cheng; Yulong Shen

arXiv:2601.11113·cs.LG·January 19, 2026

TL;DR

This paper introduces DP-SFT, a subspace fine-tuning method for large language models that injects differential privacy noise only into a low-dimensional task-specific subspace, improving privacy-utility trade-offs.

Contribution

The paper proposes a two-stage subspace fine-tuning approach for large language models that reduces the magnitude of the injected noise while preserving formal differential privacy guarantees.

Findings

01. DP-SFT improves accuracy under privacy constraints.

02. DP-SFT accelerates convergence during fine-tuning.

03. DP-SFT outperforms baseline DP fine-tuning methods.

Abstract

Fine-tuning large language models on downstream tasks is crucial for realizing their cross-domain potential but often relies on sensitive data, raising privacy concerns. Differential privacy (DP) offers rigorous privacy guarantees and has been widely adopted in fine-tuning; however, naively injecting noise across the high-dimensional parameter space creates perturbations with large norms, degrading performance and destabilizing training. To address this issue, we propose DP-SFT, a two-stage subspace fine-tuning method that substantially reduces noise magnitude while preserving formal DP guarantees. Our intuition is that, during fine-tuning, significant parameter updates lie within a low-dimensional, task-specific subspace, while other directions change minimally. Hence, we only inject DP noise into this subspace to protect privacy without perturbing irrelevant parameters. In phase one,…
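The intuition in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation: the abstract is truncated before it describes how the subspace is found, so the basis here is a random orthonormal placeholder, and the function names `random_orthonormal_basis` and `dp_subspace_step` are hypothetical. The sketch assumes the basis does not depend on the private data (or was obtained privately), clips each per-example gradient in subspace coordinates, and adds Gaussian noise only to those k coordinates. With per-coordinate standard deviation σC, the noise norm is roughly σC√k rather than σC√d for full-space noise.

```python
import numpy as np


def random_orthonormal_basis(d, k, rng):
    """Placeholder subspace: a (d, k) matrix with orthonormal columns.

    Assumption: in a real method the basis would capture task-specific
    update directions; here it is random, purely for illustration.
    """
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q


def dp_subspace_step(per_example_grads, basis, clip_norm, noise_multiplier, rng):
    """One noisy gradient step restricted to a k-dimensional subspace.

    per_example_grads: array of shape (n, d)
    basis:             array of shape (d, k), orthonormal columns
    Returns the averaged noisy update in the full d-dimensional space
    and the noise vector that was added (shape (k,)).
    """
    n = per_example_grads.shape[0]

    # Project every per-example gradient to subspace coordinates: (n, k).
    coords = per_example_grads @ basis

    # Clip each example so its contribution has norm at most clip_norm.
    norms = np.linalg.norm(coords, axis=1, keepdims=True)
    coords = coords * np.minimum(1.0, clip_norm / (norms + 1e-12))

    # Gaussian noise is drawn for k coordinates only, not for all d.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=basis.shape[1])

    noisy_mean = (coords.sum(axis=0) + noise) / n

    # Map the noisy subspace update back to parameter space.
    return basis @ noisy_mean, noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k, n = 10_000, 16, 32
    basis = random_orthonormal_basis(d, k, rng)
    grads = rng.standard_normal((n, d))

    update, sub_noise = dp_subspace_step(grads, basis, 1.0, 1.0, rng)
    full_noise = rng.normal(0.0, 1.0, size=d)  # what full-space noise would be

    print("subspace noise norm:", np.linalg.norm(sub_noise))
    print("full-space noise norm:", np.linalg.norm(full_noise))
```

Running the script prints the two noise norms side by side; the full-space norm grows with √d while the subspace norm depends only on k. The privacy accounting itself (converting the noise multiplier into an (ε, δ) guarantee) is outside the scope of this sketch.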

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Topic Modeling · Adversarial Robustness in Machine Learning