Selective Differential Privacy for Language Modeling
Weiyan Shi, Aiqi Cui, Evan Li, Ruoxi Jia, Zhou Yu

TL;DR
This paper introduces selective differential privacy for language models, which applies privacy guarantees only to the sensitive portions of the data so that utility improves while privacy is preserved. The approach is evaluated in experiments on language modeling and dialog systems.
Contribution
It proposes a new privacy notion, selective differential privacy, and develops a corresponding mechanism, Selective-DPSGD, to improve privacy-utility trade-offs in language models.
Findings
Achieves better utility than baseline methods.
Maintains privacy under various attack scenarios.
Effective in both language modeling and dialog systems.
Abstract
With the increasing applications of language models, it has become crucial to protect these models from leaking private information. Previous work has attempted to tackle this challenge by training RNN-based language models with differential privacy guarantees. However, applying classical differential privacy to language models leads to poor model performance as the underlying privacy notion is over-pessimistic and provides undifferentiated protection for all tokens in the data. Given that the private information in natural language is sparse (for example, the bulk of an email might not carry personally identifiable information), we propose a new privacy notion, selective differential privacy, to provide rigorous privacy guarantees on the sensitive portion of the data to improve model utility. To realize such a new notion, we develop a corresponding privacy mechanism, Selective-DPSGD,…
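The idea of protecting only the sensitive portion of the data can be illustrated with a toy update step. The abstract above is truncated before it describes Selective-DPSGD, so what follows is a minimal sketch under stated assumptions, not the authors' algorithm: the digit-based `policy_mask`, the per-token gradient layout, and every function and parameter name here are hypothetical. Gradients from tokens flagged as sensitive are clipped and perturbed with Gaussian noise, as in standard DP-SGD, while gradients from the remaining tokens are applied unchanged.

```python
import re
import numpy as np


def policy_mask(tokens):
    """Hypothetical policy function: flag any token containing a digit as sensitive."""
    return np.array([bool(re.search(r"\d", t)) for t in tokens])


def clip_rows(grads, max_norm):
    """Scale each row so that its L2 norm is at most max_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, max_norm / np.maximum(norms, 1e-12))
    return grads * scale


def selective_step(params, per_token_grads, mask, lr, max_norm, noise_multiplier, rng):
    """One toy update: plain SGD on non-sensitive tokens, clip-and-noise on sensitive ones.

    per_token_grads has shape (num_tokens, num_params); mask has shape (num_tokens,).
    """
    public = per_token_grads[~mask]
    private = per_token_grads[mask]
    update = public.sum(axis=0)  # non-sensitive gradients are used as-is
    if private.shape[0] > 0:
        clipped = clip_rows(private, max_norm)
        # Gaussian noise scaled to the clipping bound, as in standard DP-SGD.
        noise = rng.normal(0.0, noise_multiplier * max_norm, size=params.shape)
        update = update + clipped.sum(axis=0) + noise
    return params - lr * update / len(mask)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = ["call", "me", "at", "555", "0199"]
    mask = policy_mask(tokens)
    grads = rng.normal(size=(len(tokens), 4))
    params = np.zeros(4)
    print(selective_step(params, grads, mask, lr=0.1, max_norm=1.0,
                         noise_multiplier=1.1, rng=rng))
```

This sketch does no privacy accounting and ignores how sensitive tokens can influence later predictions through a recurrent model's hidden state, so on its own it provides no formal guarantee.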
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Access Control and Trust
