Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists

Giuseppe Attanasio; Debora Nozza; Dirk Hovy; Elena Baralis

arXiv:2203.09192·cs.CL·March 18, 2022

TL;DR

This paper introduces an entropy-based attention regularization method that reduces bias and overfitting in NLP models without relying on predefined lists of identity terms, improving fairness and performance.

Contribution

The authors propose a novel, knowledge-free regularization technique that discourages overfitting to specific terms by penalizing low-entropy attention, enhancing bias mitigation in NLP models.

Findings

01. Matches or exceeds state-of-the-art performance on hate speech classification.

02. Reveals the terms the model overfits to, which contribute to unintended bias.

03. Improves fairness metrics across multiple benchmark datasets.

Abstract

Natural Language Processing (NLP) models risk overfitting to specific terms in the training data, thereby reducing their performance, fairness, and generalizability. For example, neural hate speech detection models are strongly influenced by identity terms like gay or women, resulting in false positives, severe unintended bias, and lower performance. Most mitigation techniques use lists of identity terms or samples from the target domain during training. However, this approach requires a-priori knowledge and introduces further bias if important terms are neglected. Instead, we propose a knowledge-free Entropy-based Attention Regularization (EAR) to discourage overfitting to training-specific terms. An additional objective function penalizes tokens with low self-attention entropy. We fine-tune BERT via EAR: the resulting model matches or exceeds state-of-the-art performance for hate speech…
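
The idea of penalizing low self-attention entropy can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository listed under Code & Models for that); it assumes each token's attention weights are already available as a probability distribution, and the function names and the `alpha` default are hypothetical, chosen only for illustration.

```python
import math

def token_entropy(attn_row):
    """Shannon entropy (in nats) of one token's attention distribution."""
    return -sum(p * math.log(p) for p in attn_row if p > 0.0)

def entropy_regularizer(attention):
    """Negative mean entropy over tokens (hypothetical helper).

    The value is larger when attention is peaked (low entropy), so
    adding it to the loss penalizes low-entropy attention.
    """
    entropies = [token_entropy(row) for row in attention]
    return -sum(entropies) / len(entropies)

def regularized_loss(task_loss, attention, alpha=0.01):
    """Task loss plus the entropy term; alpha is an assumed strength."""
    return task_loss + alpha * entropy_regularizer(attention)

# Toy attention matrices: each row is one token's distribution over 3 tokens.
peaked = [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01], [0.01, 0.01, 0.98]]
uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in range(3)]

loss_peaked = regularized_loss(1.0, peaked)
loss_uniform = regularized_loss(1.0, uniform)
```

With the same task loss, `loss_peaked` is larger than `loss_uniform`: the uniform rows reach the maximum entropy of log(3) nats, so they incur the smallest penalty. How attention is aggregated across heads and layers is left out of this sketch.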

Code & Models

Repositories

g8a9/ear (PyTorch, official)

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · WordPiece · Dropout · Layer Normalization