TL;DR
This paper introduces two training strategies that incorporate bias-only models to reduce reliance on dataset biases, significantly improving the robustness and transferability of natural language understanding models to out-of-domain datasets.
Contribution
The paper proposes novel bias mitigation techniques that leverage bias-only models during training to enhance model robustness and out-of-domain generalization.
Findings
The debiasing strategies improve performance on out-of-domain evaluation sets.
Debiased models transfer better to other textual entailment datasets.
The robustness gains hold across natural language inference and fact verification benchmarks.
Abstract
Several recent studies have shown that strong natural language understanding (NLU) models are prone to relying on unwanted dataset biases without learning the underlying task, resulting in models that fail to generalize to out-of-domain datasets and are likely to perform poorly in real-world scenarios. We propose two learning strategies to train neural models, which are more robust to such biases and transfer better to out-of-domain datasets. The biases are specified in terms of one or more bias-only models, which learn to leverage the dataset biases. During training, the bias-only models' predictions are used to adjust the loss of the base model to reduce its reliance on biases by down-weighting the biased examples and focusing the training on the hard examples. We experiment on large-scale natural language inference and fact verification benchmarks, evaluating on out-of-domain…
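The loss adjustment described in the abstract can be illustrated with a small sketch. It assumes a focal-loss-style weight, (1 - p_bias[label]) ** gamma, applied to the base model's cross-entropy; the abstract does not give the exact form of the adjustment, so the function name `reweighted_loss`, the `gamma` parameter, and the example logits are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_loss(base_logits, bias_logits, label, gamma=2.0):
    """Base-model cross-entropy scaled by how badly the bias-only model does.

    Assumption: the weight is (1 - p_bias[label]) ** gamma, so examples the
    bias-only model already gets right with high confidence contribute little.
    """
    p_base = softmax(base_logits)
    p_bias = softmax(bias_logits)
    weight = (1.0 - p_bias[label]) ** gamma
    return -weight * math.log(p_base[label])


# Same base-model prediction, two different bias-only predictions.
base = [2.0, 0.5, -1.0]
# Bias-only model is confident and correct: a "biased" example.
biased_example = reweighted_loss(base, [4.0, 0.0, 0.0], label=0)
# Bias-only model is uninformative: a "hard" example.
hard_example = reweighted_loss(base, [0.0, 0.0, 0.0], label=0)
```

With identical base-model logits, the example that the bias-only model solves confidently receives a much smaller loss than the one it cannot solve, which is the down-weighting behavior the abstract describes. In a real training loop the bias-only predictions would typically be treated as constants when updating the base model.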
