FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders
Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin

TL;DR
FairFil is a novel neural debiasing method that reduces social bias in pretrained sentence encoders using contrastive learning, without retraining the original models, while maintaining task performance.
Contribution
It introduces the first neural debiasing approach for pretrained sentence encoders that leverages contrastive learning to minimize bias while preserving semantic information.
Findings
Effectively reduces bias in pretrained text encoders.
Maintains high performance on downstream NLP tasks.
Does not require retraining of original models.
Abstract
Pretrained text encoders, such as BERT, have been applied increasingly in various natural language processing (NLP) tasks, and have recently demonstrated significant performance gains. However, recent studies have demonstrated the existence of social bias in these pretrained NLP models. Although prior works have made progress on word-level debiasing, sentence-level fairness of pretrained encoders remains underexplored. In this paper, we propose the first neural debiasing method for a pretrained sentence encoder, which transforms the pretrained encoder outputs into debiased representations via a fair filter (FairFil) network. To learn the FairFil, we introduce a contrastive learning framework that not only minimizes the correlation between filtered embeddings and bias words but also preserves rich semantic information of the original sentences. On real-world datasets, our…
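The abstract names contrastive learning without giving the objective. As a rough illustration only, the sketch below implements a generic InfoNCE-style contrastive loss over paired embeddings, where row i of one batch should match row i of the other and the remaining rows act as negatives. The function names, the temperature value, and the toy 2-dimensional vectors are assumptions made for this example; this is not the paper's exact objective and it omits the bias-correlation term the abstract mentions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper, not from the paper).

    anchors[i] is treated as matching positives[i]; every other
    positives[j] in the batch serves as a negative for anchors[i].
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / n


# Toy stand-ins for filtered embeddings of a sentence and its paired view.
original_view = [[1.0, 0.0], [0.0, 1.0]]
paired_view = [[1.0, 0.0], [0.0, 1.0]]
shuffled_view = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = info_nce(original_view, paired_view)
shuffled_loss = info_nce(original_view, shuffled_view)
```

In this toy setup the loss is small when each embedding is closest to its own pair and large when the pairs are mismatched, which is the behavior a contrastive objective is meant to reward.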
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Machine Learning and Data Classification
Methods: Linear Layer · Contrastive Learning · Residual Connection · Linear Warmup With Linear Decay · Weight Decay · Multi-Head Attention · Dense Connections · Softmax · Layer Normalization
