An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models

Nicholas Meade; Elinor Poole-Dayan; Siva Reddy

arXiv:2110.08527·cs.CL·April 5, 2022·5 cites

TL;DR

This paper empirically evaluates five bias mitigation techniques for pre-trained language models, analyzing their effectiveness across bias benchmarks and their impact on language modeling and downstream tasks.

Contribution

It provides a comprehensive comparison of recent debiasing methods, highlighting the most effective technique and discussing trade-offs between bias reduction and model performance.

Findings

1. Self-Debias outperforms other techniques on bias benchmarks
2. Debiasing methods are less effective for non-gender biases
3. Bias mitigation often reduces language modeling ability

Abstract

Recent work has shown that pre-trained language models capture social biases from the large amounts of text they are trained on. This has attracted attention to developing techniques that mitigate such biases. In this work, we perform an empirical survey of five recently proposed bias mitigation techniques: Counterfactual Data Augmentation (CDA), Dropout, Iterative Nullspace Projection, Self-Debias, and SentenceDebias. We quantify the effectiveness of each technique using three intrinsic bias benchmarks while also measuring the impact of these techniques on a model's language modeling ability, as well as its performance on downstream NLU tasks. We experimentally find that: (1) Self-Debias is the strongest debiasing technique, obtaining improved scores on all bias benchmarks; (2) current debiasing techniques perform less consistently when mitigating non-gender biases; and (3) improvements on…
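
Of the five techniques named in the abstract, Counterfactual Data Augmentation is the simplest to illustrate. The sketch below is not the authors' implementation; it only shows the general idea of adding a copy of each sentence with bias-attribute words swapped. The word-pair list `PAIRS` and the helper names `counterfactual` and `augment` are hypothetical and chosen for illustration, and the naive one-to-one mapping ignores ambiguity (for example, "her" can correspond to either "his" or "him").

```python
import re

# Hypothetical, deliberately tiny list of gendered word pairs (an assumption,
# not the list used in the paper).
PAIRS = [("he", "she"), ("his", "her"), ("man", "woman"), ("father", "mother")]

# Build a two-way lookup so each word maps to its counterpart.
SWAP = {}
for a, b in PAIRS:
    SWAP[a] = b
    SWAP[b] = a

_TOKEN = re.compile(r"\b\w+\b")


def counterfactual(sentence):
    """Return the sentence with every listed word replaced by its counterpart."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        if swapped is None:
            return word
        # Preserve a leading capital letter.
        return swapped.capitalize() if word[0].isupper() else swapped
    return _TOKEN.sub(repl, sentence)


def augment(corpus):
    """Keep each original sentence and add its counterfactual when it differs."""
    augmented = []
    for sentence in corpus:
        augmented.append(sentence)
        swapped = counterfactual(sentence)
        if swapped != sentence:
            augmented.append(swapped)
    return augmented


if __name__ == "__main__":
    for line in augment(["He thanked his father.", "The sky is blue."]):
        print(line)
```

In a CDA-style setup, a model would then be further trained on the augmented corpus; that training step is outside the scope of this sketch.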

Code & Models

Models

danj0nes/dropout_gpt2 (model, 13 downloads)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Dropout