Differentially Private Next-Token Prediction of Large Language Models

James Flemings; Meisam Razaviyayn; Murali Annavaram

arXiv:2403.15638 · cs.CR · April 30, 2024 · 2 cites

TL;DR

This paper introduces PMixED, a novel privacy-preserving prediction protocol for large language models that leverages stochastic sampling and ensemble distributions to achieve differential privacy without retraining the model.

Contribution

The paper proposes PMixED, a model-agnostic private prediction protocol for next-token prediction that achieves a stronger privacy guarantee than sample-level privacy and outperforms DP-SGD at ε = 8 on large-scale datasets.

Findings

1. PMixED achieves a stronger privacy guarantee than sample-level privacy.

2. PMixED outperforms DP-SGD at a privacy budget of ε = 8 on large-scale datasets.

3. PMixED is model-agnostic and suits current cloud-based LLM deployments, where adversarial access is black-box.

Abstract

Ensuring the privacy of Large Language Models (LLMs) is becoming increasingly important. The most widely adopted technique to accomplish this is DP-SGD, which trains a model to guarantee Differential Privacy (DP). However, DP-SGD overestimates an adversary's capabilities by assuming white-box access to the model and, as a result, incurs longer training times and larger memory usage than SGD. On the other hand, commercial LLM deployments are predominantly cloud-based; hence, adversarial access to LLMs is black-box. Motivated by these observations, we present Private Mixing of Ensemble Distributions (PMixED): a private prediction protocol for next-token prediction that utilizes the inherent stochasticity of next-token sampling and a public model to achieve Differential Privacy. We formalize this by introducing RD-mollifiers, which project each model's output distribution from an…
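The mixing idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and the truncated abstract does not give the exact projection. It assumes that each ensemble member's next-token distribution is pulled toward the public model's distribution until its Rényi divergence from the public distribution falls under a chosen bound, after which the mixed distributions are averaged and a token is sampled. The function names, the bisection search, and the parameters `alpha` and `bound` are all hypothetical, and the sketch omits the privacy accounting that an actual DP guarantee would require.

```python
import math
import random

def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) for discrete distributions.
    Assumes alpha > 1 and that q is strictly positive everywhere."""
    total = sum((pi ** alpha) * (qi ** (1.0 - alpha)) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1.0)

def mix(p, q, lam):
    """Convex combination lam * p + (1 - lam) * q."""
    return [lam * pi + (1.0 - lam) * qi for pi, qi in zip(p, q)]

def largest_lambda(p, q, alpha, bound, iters=50):
    """Largest lam in [0, 1] with D_alpha(mix(p, q, lam) || q) <= bound.
    Bisection is valid because the divergence is nondecreasing in lam."""
    if renyi_divergence(p, q, alpha) <= bound:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if renyi_divergence(mix(p, q, mid), q, alpha) <= bound:
            lo = mid
        else:
            hi = mid
    return lo

def private_next_token(ensemble, public, alpha, bound, rng):
    """Hypothetical helper: mix every ensemble distribution toward the public
    one, average the results, and sample a token index from the average."""
    mixed = [mix(p, public, largest_lambda(p, public, alpha, bound)) for p in ensemble]
    avg = [sum(col) / len(mixed) for col in zip(*mixed)]
    token = rng.choices(range(len(avg)), weights=avg, k=1)[0]
    return token, avg

# Toy example: a 4-token vocabulary, a uniform public model, two private models.
public = [0.25, 0.25, 0.25, 0.25]
ensemble = [[0.97, 0.01, 0.01, 0.01], [0.4, 0.3, 0.2, 0.1]]
token, avg = private_next_token(ensemble, public, alpha=2.0, bound=0.1, rng=random.Random(0))
```

Under these assumptions, a private distribution that is already close to the public one is left unchanged (`lam = 1`), while a sharply peaked one is shrunk toward the public distribution before averaging.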

Code & Models

Repositories

james-flemings/pmixed (PyTorch, official)

Videos

Differentially Private Next-Token Prediction of Large Language Models · underline

Taxonomy

Topics: Privacy-Preserving Technologies in Data

Methods: Sparse Evolutionary Training · Stochastic Gradient Descent