Fair NLP Models with Differentially Private Text Encoders
Gaurav Maheshwari, Pascal Denis, Mikaela Keller, Aurélien Bellet

TL;DR
This paper introduces FEDERATE, a method combining differential privacy and adversarial training to create private text representations that enhance fairness in NLP models, demonstrating improved privacy, fairness, and accuracy trade-offs.
Contribution
FEDERATE is a novel approach that integrates differential privacy with adversarial training to produce fairer and more private text encodings in NLP.
Findings
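Across four NLP datasets, FEDERATE consistently improves upon previous methods in the trade-off between privacy, fairness, and accuracy.
The evaluation measures the privacy of the representations against the fairness and accuracy of the downstream model.
The results suggest that privacy and fairness can positively reinforce each other.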
Abstract
Encoded text representations often capture sensitive attributes about individuals (e.g., race or gender), which raises privacy concerns and can make downstream models unfair to certain groups. In this work, we propose FEDERATE, an approach that combines ideas from differential privacy and adversarial training to learn private text representations that also induce fairer models. We empirically evaluate the trade-off between the privacy of the representations and the fairness and accuracy of the downstream model on four NLP datasets. Our results show that FEDERATE consistently improves upon previous methods, and thus suggest that privacy and fairness can positively reinforce each other.
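The abstract does not spell out the mechanism, so the following is only a minimal sketch of the differential privacy side of such an encoder, not the authors' implementation. It assumes (this page does not confirm it) that an encoder output is rescaled to a fixed L1 norm and then perturbed with Laplace noise, which is the standard Laplace mechanism. The names `l1_normalize`, `privatize_embedding`, `radius`, and `epsilon` are hypothetical, and the adversarial training component is omitted entirely.

```python
import random


def laplace_sample(rng, scale):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def l1_normalize(vec, radius=1.0):
    """Rescale `vec` so that its L1 norm equals `radius`.
    An all-zero vector is returned unchanged."""
    norm = sum(abs(x) for x in vec)
    if norm == 0:
        return list(vec)
    return [x * radius / norm for x in vec]


def privatize_embedding(vec, epsilon, radius=1.0, rng=None):
    """Laplace mechanism on an L1-normalized embedding (illustrative only).

    After normalization, any two embeddings differ by at most 2 * radius
    in L1 norm, so adding Laplace noise of scale 2 * radius / epsilon to
    each coordinate gives epsilon-differential privacy for the released
    vector under that sensitivity bound.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    normalized = l1_normalize(vec, radius)
    scale = 2.0 * radius / epsilon
    return [x + laplace_sample(rng, scale) for x in normalized]


if __name__ == "__main__":
    # Hypothetical 4-dimensional encoder output.
    embedding = [0.5, -1.5, 2.0, 0.0]
    print(privatize_embedding(embedding, epsilon=1.0, rng=random.Random(0)))
```

A smaller `epsilon` means larger noise and stronger privacy, at the cost of less informative representations. That is the kind of privacy versus accuracy trade-off the abstract says is evaluated.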
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Ethics and Social Impacts of AI
