Differentially Private Knowledge Distillation via Synthetic Text Generation

James Flemings; Murali Annavaram

arXiv:2403.00932 · cs.LG · December 17, 2025 · 2 cites

TL;DR

This paper introduces DistilDP, a differentially private knowledge distillation method that trains a smaller student model on synthetic text generated by a differentially private teacher LLM, compressing the model while preserving privacy and utility.

Contribution

The paper proposes a novel DP knowledge distillation approach leveraging synthetic data and hidden representation alignment, improving utility over existing methods.
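As a rough illustration of how training on synthetic text can be combined with hidden representation alignment, the sketch below adds a next-token cross-entropy term on synthetic tokens to a mean-squared-error term between student and teacher hidden states. This is not the paper's implementation: the function names, the `alpha` weight, the use of MSE as the alignment measure, and the assumption that both models share a hidden size are all hypothetical choices made for the example.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(student_logits, target_ids):
    """Mean negative log-likelihood of the synthetic target tokens."""
    nll = [-log_softmax(row)[t] for row, t in zip(student_logits, target_ids)]
    return sum(nll) / len(nll)

def hidden_mse(student_hidden, teacher_hidden):
    """Mean squared error between same-shaped hidden-state vectors."""
    sq = [(s - t) ** 2
          for s_vec, t_vec in zip(student_hidden, teacher_hidden)
          for s, t in zip(s_vec, t_vec)]
    return sum(sq) / len(sq)

def distillation_loss(student_logits, target_ids,
                      student_hidden, teacher_hidden, alpha=0.5):
    """Hypothetical combined objective: CE on synthetic text + alpha * MSE.

    alpha is an assumed weighting, not a value taken from the paper.
    """
    ce = cross_entropy(student_logits, target_ids)
    align = hidden_mse(student_hidden, teacher_hidden)
    return ce + alpha * align

# Toy usage: one position, vocabulary of 4, hidden size of 2.
loss = distillation_loss(
    student_logits=[[0.0, 0.0, 0.0, 0.0]],
    target_ids=[2],
    student_hidden=[[1.0, 2.0]],
    teacher_hidden=[[0.0, 0.0]],
)
```

If the student and teacher had different hidden sizes, a learned projection would be needed before the alignment term; that step is omitted here.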

Findings

01. Significant utility improvement over baselines, reducing perplexity by at least 9.0 on the Big Patent dataset.

02. Effective privacy-utility trade-off demonstrated at ε = 2.

03. Progress in privacy-preserving compression of autoregressive LLMs.
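For readers unfamiliar with the metric in the first finding, perplexity is the exponential of the mean per-token negative log-likelihood, so lower is better. The sketch below is a minimal illustration of that standard definition; the function name and the toy input are assumptions for the example and are not taken from the paper or its repository.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities of each observed token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# A model that assigns probability 0.25 to every observed token
# has a perplexity of approximately 4.
ppl = perplexity([math.log(0.25)] * 4)
```

Because the metric is exponential in the average loss, an absolute perplexity drop such as the one reported is easiest to interpret next to the baseline's own perplexity.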

Abstract

Large language models (LLMs) are achieving state-of-the-art performance in many different downstream tasks. However, the increasing urgency of data privacy puts pressure on practitioners to train LLMs with Differential Privacy (DP) on private data. Concurrently, the exponential growth in parameter size of LLMs necessitates model compression before deployment of LLMs on resource-constrained devices or latency-sensitive applications. Differential privacy and model compression generally must trade off utility loss to achieve their objectives. Moreover, simultaneously applying both schemes can compound the utility degradation. To this end, we propose DistilDP: a novel differentially private knowledge distillation algorithm that exploits synthetic data generated by a differentially private teacher LLM. The knowledge of a teacher LLM is transferred onto the student in two ways: one way from…

Code & Models

Repositories

james-flemings/dp_compress (PyTorch, official)

Videos

Differentially Private Knowledge Distillation via Synthetic Text Generation · Underline

Taxonomy

Topics: Topic Modeling

Methods: Knowledge Distillation