FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

Pengyu Cheng; Weituo Hao; Siyang Yuan; Shijing Si; Lawrence Carin

arXiv:2103.06413 · cs.CL · March 12, 2021 · 50 cites

TL;DR

FairFil is a novel neural debiasing method that reduces social bias in pretrained sentence encoders using contrastive learning, without retraining the original models, while maintaining task performance.

Contribution

It introduces the first neural debiasing approach for pretrained sentence encoders that leverages contrastive learning to minimize bias while preserving semantic information.

Findings

01 Effectively reduces bias in pretrained text encoders.

02 Maintains high performance on downstream NLP tasks.

03 Does not require retraining of original models.

Abstract

Pretrained text encoders, such as BERT, have been increasingly applied in various natural language processing (NLP) tasks, and have recently demonstrated significant performance gains. However, recent studies have demonstrated the existence of social bias in these pretrained NLP models. Although prior works have made progress on word-level debiasing, improved sentence-level fairness of pretrained encoders still lacks exploration. In this paper, we propose the first neural debiasing method for a pretrained sentence encoder, which transforms the pretrained encoder outputs into debiased representations via a fair filter (FairFil) network. To learn the FairFil, we introduce a contrastive learning framework that not only minimizes the correlation between filtered embeddings and bias words but also preserves rich semantic information of the original sentences. On real-world datasets, our…
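The contrastive idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the linear `fair_filter`, the InfoNCE-style loss, the temperature value, and the choice of positive pairs (a sentence embedding paired with the embedding of a counterpart sentence whose bias words are swapped) are all assumptions made here for illustration. The bias-word correlation penalty mentioned in the abstract is not shown. The sketch only demonstrates how a contrastive loss rewards filtered embeddings that stay close to their paired counterpart and far from other sentences in the batch.

```python
import math


def fair_filter(weights, embedding):
    """Hypothetical linear filter: maps an encoder output to a filtered vector."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch.

    positives[i] is the positive for anchors[i]; every other entry of
    `positives` acts as a negative for that anchor.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy 2-d "encoder outputs" (assumed values, not real BERT embeddings).
identity = [[1.0, 0.0], [0.0, 1.0]]
originals = [fair_filter(identity, e) for e in ([1.0, 0.0], [0.0, 1.0])]
counterparts = [fair_filter(identity, e) for e in ([1.0, 0.0], [0.0, 1.0])]

aligned_loss = info_nce(originals, counterparts)
mismatched_loss = info_nce(originals, list(reversed(counterparts)))
```

In this toy setup the loss is small when each filtered embedding matches its counterpart and large when the pairs are shuffled. A trainable filter would adjust `weights` to reduce that loss while a separate term discourages correlation with bias words.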

Videos

FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Machine Learning and Data Classification

Methods: Linear Layer · Contrastive Learning · Residual Connection · Linear Warmup With Linear Decay · Weight Decay · Multi-Head Attention · Dense Connections · Softmax · Layer Normalization