Benchmarking Differential Privacy and Federated Learning for BERT Models
Priyam Basu, Tiasa Singha Roy, Rakshit Naidu, Zumrut Muftuoglu, Sahib Singh, Fatemehsadat Mireshghallah

TL;DR
This paper evaluates how differential privacy impacts the training of BERT-based NLP models in centralized and federated settings, providing insights for privacy-preserving healthcare applications.
Contribution
The paper benchmarks the effect of differential privacy on four BERT-family models (BERT, ALBERT, RoBERTa and DistilBERT) in both centralized and federated setups, and offers practical guidance on which architectures and setups give better privacy-utility trade-offs.
Findings
Differential privacy reduces model utility, but the loss can be traded off against privacy strength by tuning the privacy parameters.
Federated learning with DP shows promising privacy-utility trade-offs.
Open-source implementation facilitates future healthcare NLP research.
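To make the privacy-utility trade-off in the findings above concrete, here is a minimal sketch of the clip-and-noise step commonly used in DP-SGD style training. It is a generic illustration, not the paper's implementation: the function name `dp_average_gradient` and the parameters `clip_norm` and `noise_multiplier` are assumptions introduced for this example. A smaller `clip_norm` or a larger `noise_multiplier` gives stronger privacy at the cost of a noisier gradient, which is where the utility loss comes from.

```python
import math
import random


def dp_average_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average.

    Hypothetical helper for illustration only; parameter names are assumptions.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        # L2 norm of this example's gradient.
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down only gradients whose norm exceeds clip_norm.
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        for i, g in enumerate(grad):
            total[i] += g * scale
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]
    # With noise_multiplier=0.0 only clipping is applied.
    print(dp_average_gradient(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng))
    # With noise enabled, the averaged gradient is perturbed.
    print(dp_average_gradient(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
```

In practice a library would also track the cumulative privacy budget across training steps; that accounting is omitted here.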
Abstract
Natural Language Processing (NLP) techniques can be applied to help with the diagnosis of medical conditions such as depression, using a collection of a person's utterances. Depression is a serious medical illness that can have adverse effects on how one feels, thinks, and acts, which can lead to emotional and physical problems. Due to the sensitive nature of such data, privacy measures need to be taken for handling and training models with such data. In this work, we study the effects that the application of Differential Privacy (DP) has, in both a centralized and a Federated Learning (FL) setup, on training contextualized language models (BERT, ALBERT, RoBERTa and DistilBERT). We offer insights on how to privately train NLP models and what architectures and setups provide more desirable privacy utility trade-offs. We envisage this work to be used in future healthcare and mental health…
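The federated setup mentioned in the abstract can be pictured with a small sketch of federated averaging, in which a server combines model parameters trained separately on each client's data. This is a generic illustration under stated assumptions, not the authors' code: the function name `federated_average` and the flat-list representation of model parameters are hypothetical. In a DP variant, each client would typically clip and add noise to its update before sending it to the server.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    Hypothetical helper for illustration only.
    client_weights: list of equal-length lists of floats, one per client.
    client_sizes: number of training examples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two clients; the second holds three times as much data as the first.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Weighting by dataset size means clients with more examples pull the shared model further toward their local solution; raw text never leaves the client, only parameters do.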
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare · Mental Health via Writing
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · WordPiece · Adam · Dropout · Layer Normalization · LAMB
