Privacy-preserving Neural Representations of Text
Maximin Coavoux, Shashi Narayan, Shay B. Cohen

TL;DR
This paper investigates privacy risks in neural NLP models by analyzing how hidden representations can leak private information and proposes defense strategies to enhance privacy without significantly sacrificing utility.
Contribution
It introduces a framework for measuring privacy leakage from neural representations and proposes novel training methods to defend against such attacks.
Findings
Modified training objectives improve privacy of neural representations.
The tradeoff between the privacy and the utility of neural representations is characterized.
Defense methods increase privacy without major utility loss.
Abstract
This article deals with adversarial attacks towards deep learning systems for Natural Language Processing (NLP), in the context of privacy protection. We study a specific type of attack: an attacker eavesdrops on the hidden representations of a neural text classifier and tries to recover information about the input text. Such a scenario may arise when the computation of a neural network is shared across multiple devices, e.g., some hidden representation is computed by a user's device and sent to a cloud-based model. We measure the privacy of a hidden representation by the ability of an attacker to predict accurately specific private information from it and characterize the tradeoff between the privacy and the utility of neural representations. Finally, we propose several defense methods based on modified training objectives and show that they improve the privacy of neural…
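The measurement idea in the abstract, scoring a representation by how well an attacker can predict private information from it, can be illustrated with a small sketch. Everything below is an assumption made for illustration and not the authors' setup: the data is synthetic, the attacker is a plain logistic regression trained with NumPy, the private attribute is binary, and the score is taken to be one minus attacker accuracy. The names `make_data`, `train_attacker`, and `privacy_score` are hypothetical.

```python
import numpy as np

def make_data(n, dim, leak, rng):
    """Synthetic 'hidden representations' h and a binary private label z.
    `leak` shifts the first feature by the label; leak=0 means no signal."""
    z = rng.integers(0, 2, size=n)
    h = rng.normal(size=(n, dim))
    h[:, 0] += leak * z
    return h, z

def train_attacker(h, z, lr=0.5, epochs=300):
    """Fit a logistic-regression attacker with batch gradient descent."""
    w = np.zeros(h.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(h @ w + b)))  # predicted P(z = 1 | h)
        grad = p - z                            # gradient of log loss w.r.t. logits
        w -= lr * (h.T @ grad) / len(z)
        b -= lr * grad.mean()
    return w, b

def privacy_score(h_train, z_train, h_test, z_test):
    """Assumed score: 1 - accuracy of the attacker on held-out data."""
    w, b = train_attacker(h_train, z_train)
    pred = (h_test @ w + b) > 0
    return 1.0 - float((pred == z_test).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for leak in (3.0, 0.0):
        h_tr, z_tr = make_data(1000, 8, leak, rng)
        h_te, z_te = make_data(1000, 8, leak, rng)
        print(f"leak={leak}: privacy={privacy_score(h_tr, z_tr, h_te, z_te):.3f}")
```

With a balanced binary attribute, a score near 0.5 means the attacker does no better than chance, while a score near 0 means the representation exposes the attribute almost completely. In a real evaluation the representations would come from the trained text classifier, and the attacker would be evaluated on examples it was not trained on, as done here.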
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Ethics and Social Impacts of AI
