# On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference

**Authors:** Yonatan Belinkov, Adam Poliak, Stuart M. Shieber, Benjamin Van Durme, Alexander M. Rush

arXiv: 1907.04389 · 2019-07-11

## TL;DR

This paper evaluates whether adversarial learning can encourage Natural Language Inference (NLI) models to learn representations free of hypothesis-only biases, and finds that such representations may be less biased at the cost of only small drops in NLI accuracy.

## Contribution

The paper's analyses indicate that adversarial learning may yield NLI representations that are less affected by hypothesis-only biases while retaining most of the model's accuracy.

## Key findings

- Representations learned via adversarial learning may be less biased toward hypothesis-only cues.
- Mitigating the bias costs only small drops in NLI accuracy.
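
To make the idea of adversarial learning against hypothesis-only bias more concrete, here is a minimal sketch assuming a gradient-reversal formulation. This summary does not specify the adversarial setup used in the paper, so the toy scalar model, the function names `reversed_gradient` and `train_step`, the weight `lam`, and the features `x_hyp` and `x_prem` are all illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reversed_gradient(task_grad, adv_grad, lam):
    """Gradient the encoder descends under gradient reversal.

    The adversary's gradient enters with a flipped sign, scaled by lam
    (a hypothetical trade-off weight), so the encoder is pushed to make
    the adversary's prediction harder while still serving the main task.
    """
    return [t - lam * a for t, a in zip(task_grad, adv_grad)]


def train_step(w, a, b, x_hyp, x_prem, y, lam=1.0, lr=0.1):
    """One SGD step on a toy scalar model (illustrative assumption only).

    w      encoder weight; h = w * x_hyp stands in for a hypothesis encoding
    a      main-task head weight; it sees the hypothesis AND the premise
    b      adversary head weight; it sees the hypothesis encoding only
    y      binary label (0 or 1) that both heads try to predict
    """
    h = w * x_hyp
    err_task = sigmoid(a * h * x_prem) - y  # d(log-loss)/d(task logit)
    err_adv = sigmoid(b * h) - y            # d(log-loss)/d(adversary logit)

    # Both heads minimise their own loss in the usual way.
    grad_a = err_task * h * x_prem
    grad_b = err_adv * h

    # The encoder minimises the task loss but maximises the adversary loss.
    grad_w_task = err_task * a * x_prem * x_hyp
    grad_w_adv = err_adv * b * x_hyp
    (grad_w,) = reversed_gradient([grad_w_task], [grad_w_adv], lam)

    return w - lr * grad_w, a - lr * grad_a, b - lr * grad_b


if __name__ == "__main__":
    w, a, b = 1.0, 0.5, 0.5
    for _ in range(5):
        w, a, b = train_step(w, a, b, x_hyp=1.0, x_prem=1.0, y=1)
    print(w, a, b)
```

Setting `lam=0` in this sketch recovers ordinary training, while larger values trade task accuracy for a representation the hypothesis-only adversary finds harder to exploit, which mirrors the accuracy-versus-bias trade-off described above.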

## Abstract

Popular Natural Language Inference (NLI) datasets have been shown to be tainted by hypothesis-only biases. Adversarial learning may help models ignore sensitive biases and spurious correlations in data. We evaluate whether adversarial learning can be used in NLI to encourage models to learn representations free of hypothesis-only biases. Our analyses indicate that the representations learned via adversarial learning may be less biased, with only small drops in NLI accuracy.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.04389/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1907.04389/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1907.04389/full.md

---
Source: https://tomesphere.com/paper/1907.04389