Private Language Models via Truncated Laplacian Mechanism

Tianhao Huang, Tao Yang, Ivan Habernal, Lijie Hu, and Di Wang

arXiv:2410.08027·cs.CL·October 11, 2024

TL;DR

This paper introduces a novel high-dimensional truncated Laplacian mechanism for private word embeddings, achieving better privacy-utility trade-offs in NLP models under differential privacy constraints.

Contribution

It extends the truncated Laplacian mechanism to high-dimensional spaces, providing a new private embedding method with lower variance and improved performance in high privacy regimes.

Findings

01. Lower variance compared to previous methods

02. Maintains high utility even in high privacy regimes

03. Effective on multiple datasets and downstream tasks

Abstract

Deep learning models for NLP tasks are vulnerable to a variety of privacy attacks. To prevent privacy leakage, researchers have investigated word-level perturbations that rely on the formal guarantees of differential privacy (DP) in the embedding space. However, many existing approaches either perform unsatisfactorily in the high privacy regime when using the Laplacian or Gaussian mechanism, or resort to relaxations of DP that offer weaker privacy guarantees than canonical DP. This raises the question of whether a new private word embedding method can be designed to overcome these limitations. In this paper, we propose a novel private embedding method called the high-dimensional truncated Laplacian mechanism. Specifically, we introduce a non-trivial extension of the truncated Laplacian mechanism, which was previously only investigated in one-dimensional space…
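To make the idea of truncated Laplacian noise concrete, here is a minimal Python sketch of sampling from a one-dimensional Laplace distribution restricted to a bounded interval and adding it to an embedding vector. It is an illustration only, not the paper's high-dimensional mechanism: the independent per-coordinate noise, the function names, and the `scale` and `bound` parameters are all assumptions made here, and the values are not calibrated to any (ε, δ) guarantee.

```python
import math
import random


def sample_truncated_laplace(scale, bound, rng):
    """Draw one sample with density proportional to exp(-|x| / scale) on [-bound, bound].

    Both parameters are hypothetical; they are not tied to a privacy budget here.
    """
    if scale <= 0 or bound <= 0:
        raise ValueError("scale and bound must be positive")
    # Inverse-CDF sampling of the magnitude from an exponential truncated to [0, bound].
    u = rng.random()  # uniform in [0, 1)
    mass = 1.0 - math.exp(-bound / scale)  # probability mass of Exp(scale) on [0, bound]
    magnitude = -scale * math.log(1.0 - u * mass)
    magnitude = min(magnitude, bound)  # guard against floating-point overshoot
    # The density is symmetric around zero, so the sign is a fair coin flip.
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * magnitude


def perturb_embedding(embedding, scale, bound, rng):
    """Add independent truncated Laplace noise to each coordinate (assumed, illustrative)."""
    return [x + sample_truncated_laplace(scale, bound, rng) for x in embedding]


rng = random.Random(0)
original = [0.2, -0.5, 0.1]
noisy = perturb_embedding(original, scale=1.0, bound=2.0, rng=rng)
```

Because every noise value lies in `[-bound, bound]`, the variance of the noise is strictly smaller than the `2 * scale**2` variance of an untruncated Laplace distribution with the same scale, which gives some intuition for the lower-variance claim in the findings.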

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis