Investigating Gender Bias in BERT

Rishabh Bhardwaj; Navonil Majumder; Soujanya Poria

arXiv:2009.05021·cs.CL·September 11, 2020

TL;DR

This paper analyzes gender bias in BERT, shows how it affects downstream emotion and sentiment intensity predictions, and proposes a method to identify and remove gender-specific directions in the embeddings to reduce that bias.

Contribution

The paper introduces a novel algorithm for finding fine-grained gender directions in BERT's layers, enabling targeted bias mitigation while preserving task-relevant information.
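
As a rough illustration of what "removing a direction" from an embedding means, the sketch below projects a single direction out of a set of vectors. It is not the paper's algorithm, which this page does not spell out. The random vectors stand in for BERT embeddings, and `remove_direction` and `gender_direction` are hypothetical names chosen for the example.

```python
import numpy as np

def remove_direction(embeddings, direction):
    """Project one direction out of every row of `embeddings`."""
    unit = direction / np.linalg.norm(direction)
    # Subtract each row's component along the unit direction.
    return embeddings - np.outer(embeddings @ unit, unit)

# Toy data: random vectors standing in for BERT embeddings (assumption).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(4, 8))

# Hypothetical gender direction; in practice it would be estimated from data,
# for example from differences between embeddings of gendered word pairs.
gender_direction = rng.normal(size=8)

debiased = remove_direction(embeddings, gender_direction)

# After projection, no row has a component along the removed direction.
unit = gender_direction / np.linalg.norm(gender_direction)
residual = debiased @ unit
```

Projection leaves every component orthogonal to the removed direction unchanged, which is why this kind of edit can, in principle, leave most task-relevant information intact.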

Findings

01

Removing gender directions reduces bias in downstream tasks

02

The proposed method preserves task performance while mitigating bias

03

Gender bias is significantly linked to specific directions in BERT embeddings

Abstract

Contextual language models (CLMs) have pushed NLP benchmarks to new heights. It has become the norm to use CLM-provided word embeddings in downstream tasks such as text classification. However, unless addressed, CLMs are prone to learning the gender bias intrinsic to the dataset. As a result, predictions of downstream NLP models can vary noticeably when gender words are varied, such as replacing "he" with "she", or even gender-neutral words. In this paper, we focus our analysis on a popular CLM, BERT. We analyse the gender bias it induces in five downstream tasks related to emotion and sentiment intensity prediction. For each task, we train a simple regressor on BERT's word embeddings. We then evaluate the gender bias in the regressors using an equity evaluation corpus. Ideally, and by design, the models should discard gender-informative features from the input.…
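
The evaluation idea in the abstract, comparing predictions on sentences that differ only in a gender word, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's evaluation code: `toy_predict` is a hypothetical stand-in for a trained intensity regressor with a bias injected on purpose, and the sentence pairs are invented examples rather than items from an actual equity evaluation corpus.

```python
from statistics import mean

def prediction_gap(predict, sentence_pairs):
    """Return the mean signed and mean absolute score difference over pairs."""
    diffs = [predict(a) - predict(b) for a, b in sentence_pairs]
    return mean(diffs), mean(abs(d) for d in diffs)

def toy_predict(sentence):
    """Hypothetical regressor: scores emotion intensity from keywords."""
    words = sentence.lower().split()
    score = 0.5
    if "angry" in words:
        score += 0.3
    if "she" in words:
        score += 0.1  # deliberately injected gender bias, for illustration
    return score

# Invented pairs that differ only in the gendered pronoun.
pairs = [
    ("He feels angry", "She feels angry"),
    ("He is here", "She is here"),
]

signed_gap, abs_gap = prediction_gap(toy_predict, pairs)
```

A regressor that ignores gender would give a gap of zero on such pairs; a consistently nonzero signed gap indicates that predictions shift with the gender word alone.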

Taxonomy

Methods: Linear Layer · Layer Normalization · Weight Decay · Dropout · Linear Warmup With Linear Decay · Dense Connections · WordPiece · Attention Dropout · Attention Is All You Need · Multi-Head Attention