DBR: Divergence-Based Regularization for Debiasing Natural Language Understanding Models

Zihao Li; Ruixiang Tang; Lu Cheng; Shuaiqiang Wang; Dawei Yin; Mengnan Du

arXiv:2502.18353 · cs.CL · February 26, 2025

Open Access

TL;DR

This paper introduces Divergence Based Regularization (DBR), a novel method to reduce shortcut learning in pre-trained language models, thereby improving their out-of-domain generalization in natural language understanding tasks.

Contribution

The paper proposes a regularization technique that measures the divergence between a model's output distributions for original examples and for the same examples with shortcut tokens masked, and uses it to reduce shortcut reliance in PLMs.

Findings

1. Improves out-of-domain performance on three NLU tasks
2. Reduces reliance on superficial shortcut features
3. Largely preserves in-domain accuracy (little loss reported)

Abstract

Pre-trained language models (PLMs) have achieved impressive results on various natural language processing tasks. However, recent research has revealed that these models often rely on superficial features and shortcuts instead of developing a genuine understanding of language, especially for natural language understanding (NLU) tasks. Consequently, the models struggle to generalize to out-of-domain data. In this work, we propose Divergence Based Regularization (DBR) to mitigate this shortcut learning behavior. Our method measures the divergence between the output distributions for original examples and examples where shortcut tokens have been masked. This process prevents the model's predictions from being overly influenced by shortcut features or biases. We evaluate our model on three NLU tasks and find that it improves out-of-domain performance with little loss of in-domain accuracy.…
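The regularizer described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which divergence is used, how shortcut tokens are identified, or how the terms are weighted, so the KL divergence, the `weight` parameter, and the helper names `mask_tokens` and `dbr_style_loss` below are all assumptions chosen for illustration. The logits stand in for the outputs of a classifier run once on the original input and once on the masked input.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); the choice of KL is an assumption, not from the paper."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mask_tokens(tokens, shortcut_positions, mask_token="[MASK]"):
    """Replace tokens at the given (hypothetical) shortcut positions."""
    return [mask_token if i in shortcut_positions else t
            for i, t in enumerate(tokens)]


def dbr_style_loss(logits_orig, logits_masked, label, weight=1.0):
    """Cross-entropy on the original input plus a divergence penalty.

    The penalty grows when masking the shortcut tokens changes the
    predicted distribution, i.e. when the prediction leaned on them.
    """
    p_orig = softmax(logits_orig)
    p_masked = softmax(logits_masked)
    cross_entropy = -math.log(p_orig[label])
    divergence = kl_divergence(p_orig, p_masked)
    return cross_entropy + weight * divergence


# Hypothetical example: "not" is treated as a shortcut token and masked.
tokens = ["the", "movie", "was", "not", "good"]
masked = mask_tokens(tokens, {3})

# Made-up logits for the two forward passes of a two-class classifier.
loss = dbr_style_loss([2.0, 0.0], [0.0, 0.0], label=0)
```

When the two output distributions match, the penalty is zero and the loss reduces to ordinary cross-entropy; the further the masked prediction drifts from the original one, the larger the added term.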

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis