An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models
Zhongbin Xie, Thomas Lukasiewicz

TL;DR
This paper evaluates parameter-efficient bias mitigation methods for pre-trained language models and shows that their effectiveness varies by bias type and model; for gender bias, adapter tuning is consistently the most effective.
Contribution
It provides a comprehensive empirical comparison of prefix, prompt, and adapter tuning combined with counterfactual data augmentation for debiasing.
Findings
Adapter tuning is most effective for gender bias.
Prompt tuning works better for GPT-2 than BERT.
Parameter-efficient methods can match or outperform full fine-tuning.
Abstract
The increasingly large size of modern pretrained language models not only makes them inherit more human-like biases from the training corpora, but also makes it computationally expensive to mitigate such biases. In this paper, we investigate recent parameter-efficient methods in combination with counterfactual data augmentation (CDA) for bias mitigation. We conduct extensive experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance and abilities to preserve the internal knowledge of a pre-trained model. We find that the parameter-efficient methods (i) are effective in mitigating gender bias, where adapter tuning is consistently the most effective one and prompt tuning is more suitable for GPT-2 than BERT, (ii) are less effective when it comes to racial and religious bias, which may be attributed…
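As a rough illustration of the counterfactual data augmentation (CDA) step mentioned in the abstract, the sketch below swaps gendered words in each sentence and keeps both the original and the swapped version. The word-pair list, the function names, and the choice to keep both versions (two-sided CDA) are assumptions made for this example; the summary above does not specify the authors' word lists or exact procedure.

```python
import re

# Hypothetical, deliberately tiny word-pair list (an assumption for this sketch).
# Real CDA setups rely on much larger curated lists.
GENDER_PAIRS = [
    ("he", "she"),
    ("man", "woman"),
    ("boy", "girl"),
    ("father", "mother"),
    ("king", "queen"),
]

# Build a bidirectional lookup so each word maps to its counterpart.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a


def counterfactual(sentence: str) -> str:
    """Return the sentence with every listed gendered word swapped."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        target = SWAP.get(word.lower())
        if target is None:
            return word  # not a listed attribute word, leave unchanged
        if word.isupper():
            return target.upper()
        if word[0].isupper():
            return target.capitalize()
        return target

    # Match alphabetic tokens only; punctuation and spacing are preserved.
    return re.sub(r"[A-Za-z]+", swap, sentence)


def augment(corpus: list[str]) -> list[str]:
    """Two-sided CDA: keep each sentence and add its counterfactual if it differs."""
    augmented = []
    for sentence in corpus:
        augmented.append(sentence)
        swapped = counterfactual(sentence)
        if swapped != sentence:
            augmented.append(swapped)
    return augmented
```

In a parameter-efficient setup, the augmented corpus would then be used to train only the added prefix, prompt, or adapter parameters while the language model's own weights stay frozen.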
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Layer · Adam · Attention Dropout · Linear Warmup With Linear Decay · Layer Normalization · Byte Pair Encoding
