How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing
Samuel Sousa, Roman Kern

TL;DR
This systematic review categorizes and analyzes over sixty deep learning methods for privacy-preserving NLP, highlighting their foundations, challenges, and future directions to enhance data privacy in language models.
Contribution
Introduces a novel taxonomy for classifying privacy-preserving NLP methods and provides an extensive overview of privacy threats, datasets, and evaluation metrics.
Findings
Classified methods into data safeguarding, trusted, and verification categories.
Identified key privacy threats and evaluation metrics in NLP.
Discussed open challenges such as data traceability and the privacy-utility tradeoff.
Abstract
Deep learning (DL) models for natural language processing (NLP) tasks often handle private data, demanding protection against breaches and disclosures. Data protection laws, such as the European Union's General Data Protection Regulation (GDPR), thereby enforce the need for privacy. Although many privacy-preserving NLP methods have been proposed in recent years, no categories to organize them have been introduced yet, making it hard to follow the progress of the literature. To close this gap, this article systematically reviews over sixty DL methods for privacy-preserving NLP published between 2016 and 2020, covering theoretical foundations, privacy-enhancing technologies, and analysis of their suitability for real-world scenarios. First, we introduce a novel taxonomy for classifying the existing methods into three categories: data safeguarding methods, trusted methods, and verification…
Taxonomy
Topics: Privacy-Preserving Technologies in Data
