Benchmarking Differential Privacy and Federated Learning for BERT Models

Priyam Basu; Tiasa Singha Roy; Rakshit Naidu; Zumrut Muftuoglu; Sahib Singh; Fatemehsadat Mireshghallah

arXiv:2106.13973 · cs.CL · June 17, 2022 · 6 cites

Open Access · 1 Repo

TL;DR

This paper evaluates how differential privacy impacts the training of BERT-based NLP models in centralized and federated settings, providing insights for privacy-preserving healthcare applications.

Contribution

The paper benchmarks the effects of differential privacy on several BERT-family models (BERT, ALBERT, RoBERTa, and DistilBERT) in federated and centralized setups, and offers practical guidance on privacy-utility trade-offs.

Findings

01. Differential privacy reduces model utility, but the loss can be traded off against privacy strength by tuning the privacy parameters.

02. Federated learning combined with DP shows promising privacy-utility trade-offs.

03. The open-source implementation supports future healthcare NLP research.

Abstract

Natural Language Processing (NLP) techniques can be applied to help with the diagnosis of medical conditions such as depression, using a collection of a person's utterances. Depression is a serious medical illness that can have adverse effects on how one feels, thinks, and acts, which can lead to emotional and physical problems. Due to the sensitive nature of such data, privacy measures need to be taken for handling and training models with such data. In this work, we study the effects that the application of Differential Privacy (DP) has, in both a centralized and a Federated Learning (FL) setup, on training contextualized language models (BERT, ALBERT, RoBERTa and DistilBERT). We offer insights on how to privately train NLP models and what architectures and setups provide more desirable privacy utility trade-offs. We envisage this work to be used in future healthcare and mental health…
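To make the two setups in the abstract concrete, the following is a minimal sketch of a differentially private gradient step (per-example clipping plus Gaussian noise, in the style of DP-SGD) followed by federated averaging across clients. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the DP mechanism, the function names (`clip_gradient`, `dp_sgd_step`, `fed_avg`, `local_grads`) are hypothetical, the model is a toy one-parameter regressor rather than BERT, and no privacy accounting is performed.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale a per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD-style update: clip each gradient, sum, add Gaussian noise, average."""
    dim = len(weights)
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm  # noise scale is tied to the clipping bound
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / len(clipped) for s in summed]
    return [w - lr * g for w, g in zip(weights, noisy_mean)]


def fed_avg(client_weights, client_sizes):
    """Federated averaging: weight each client's parameters by its local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def local_grads(weights, data):
    """Per-example squared-error gradients for a toy model y = w * x (assumption)."""
    return [[2.0 * (weights[0] * x - y) * x] for x, y in data]


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical clients holding private (x, y) pairs drawn from y = 2x.
    clients = [
        [(1.0, 2.0), (2.0, 4.0)],
        [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
    ]
    global_w = [0.0]
    for _ in range(20):
        # Each client takes one noisy local step; the server averages the results.
        updates = [
            dp_sgd_step(global_w, local_grads(global_w, d), lr=0.05,
                        max_norm=1.0, noise_multiplier=0.5, rng=rng)
            for d in clients
        ]
        global_w = fed_avg(updates, [len(d) for d in clients])
    print(global_w)
```

In the centralized setting only `dp_sgd_step` applies, run over the pooled data. In the federated setting each client runs it locally and the server calls `fed_avg`. Raising `noise_multiplier` or lowering `max_norm` strengthens privacy at the cost of noisier updates, which is the privacy-utility trade-off the paper benchmarks. Real experiments would typically rely on a DP library that also tracks the privacy budget.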

Code & Models

Repositories

whopriyam/Benchmarking-Differential-Privacy-and-Federated-Learning-for-BERT-Models
PyTorch · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare · Mental Health via Writing

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · WordPiece · Adam · Dropout · Layer Normalization · LAMB