TL;DR
This paper introduces a self-debiasing framework for NLU models that reduces reliance on biases without prior knowledge of bias types, enhancing robustness and complementing existing methods.
Contribution
It presents a first step toward debiasing without prior knowledge of the biases: a general self-debiasing framework that improves robustness and is complementary to existing debiasing techniques.
Findings
Framework improves model robustness on challenge datasets
Lets existing debiasing methods retain their improvements on challenge datasets without targeting specific biases
Results in better generalization across diverse biases
Abstract
NLU models often exploit biases to achieve high dataset-specific performance without properly learning the intended task. Recently proposed debiasing methods are shown to be effective in mitigating this tendency. However, these methods rely on a major assumption that the types of bias should be known a-priori, which limits their application to many NLU tasks and datasets. In this work, we present the first step to bridge this gap by introducing a self-debiasing framework that prevents models from mainly utilizing biases without knowing them in advance. The proposed framework is general and complementary to the existing debiasing methods. We show that it allows these existing methods to retain the improvement on the challenge datasets (i.e., sets of examples designed to expose models' reliance on biases) without specifically targeting certain biases. Furthermore, the evaluation suggests…
