Private Language Models via Truncated Laplacian Mechanism
Tianhao Huang, Tao Yang, Ivan Habernal, Lijie Hu, and Di Wang

TL;DR
This paper introduces a novel high-dimensional truncated Laplacian mechanism for private word embeddings, achieving better privacy-utility trade-offs in NLP models under differential privacy constraints.
Contribution
The paper extends the truncated Laplacian mechanism, previously studied only in one dimension, to high-dimensional spaces, yielding a private embedding method with lower variance than previous methods and better performance in the high-privacy regime.
Findings
Lower variance compared to previous methods
Maintains high utility even in high privacy regimes
Effective on multiple datasets and downstream tasks
Abstract
Deep learning models for NLP tasks are vulnerable to a variety of privacy attacks. To prevent privacy leakage, researchers have investigated word-level perturbations, relying on the formal guarantees of differential privacy (DP) in the embedding space. However, many existing approaches either perform unsatisfactorily in the high-privacy regime when using the Laplacian or Gaussian mechanism, or resort to relaxations of DP that offer weaker privacy than canonical DP. This raises the question of whether a new method for private word embedding can be designed to overcome these limitations. In this paper, we propose a novel private embedding method called the high-dimensional truncated Laplacian mechanism. Specifically, we introduce a non-trivial extension of the truncated Laplacian mechanism, which was previously only investigated in one-dimensional space…
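To make the idea of truncated Laplacian noise concrete, the following is a minimal sketch of the one-dimensional case applied independently to each coordinate of an embedding. It is an illustration under stated assumptions, not the paper's high-dimensional mechanism: the function names and the `scale` and `bound` parameters are hypothetical, and the values used here are not calibrated to any privacy budget. The sketch only shows that truncating the noise to a bounded interval lowers its variance relative to an untruncated Laplacian with the same scale, whose variance is 2 * scale**2.

```python
import math
import random


def sample_truncated_laplace(scale, bound, rng):
    """Draw one sample with density proportional to exp(-|x| / scale)
    on [-bound, bound] and zero elsewhere (hypothetical helper)."""
    u = rng.random()  # uniform in [0, 1)
    # Probability mass of an Exp(scale) variable that falls in [0, bound].
    mass = 1.0 - math.exp(-bound / scale)
    # Inverse CDF of the exponential truncated to [0, bound] gives the magnitude.
    magnitude = -scale * math.log1p(-u * mass)
    magnitude = min(magnitude, bound)  # guard against floating-point overshoot
    # The density is symmetric, so the sign is a fair coin flip.
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * magnitude


def perturb_embedding(embedding, scale, bound, rng):
    """Add independent truncated Laplacian noise to every coordinate.
    Assumption: per-coordinate noise, NOT the paper's joint high-dimensional noise."""
    return [x + sample_truncated_laplace(scale, bound, rng) for x in embedding]


rng = random.Random(0)
samples = [sample_truncated_laplace(1.0, 2.0, rng) for _ in range(100_000)]
# The noise has zero mean by symmetry, so the mean square estimates the variance.
empirical_variance = sum(s * s for s in samples) / len(samples)
untruncated_variance = 2.0 * 1.0 ** 2  # variance of Laplace(scale=1.0)
print(empirical_variance < untruncated_variance)
```

A call such as `perturb_embedding([0.1, -0.3, 0.5], scale=1.0, bound=2.0, rng=random.Random(1))` returns a vector in which no coordinate moves by more than `bound`. Choosing `scale` and `bound` so that the output satisfies a given DP guarantee is the part the paper addresses and is deliberately left out of this sketch.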
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
