# Towards Debiasing Fact Verification Models

**Authors:** Tal Schuster, Darsh J Shah, Yun Jie Serene Yeo, Daniel Filizzola, Enrico Santus, Regina Barzilay

arXiv: 1908.05267 · 2019-09-04

## TL;DR

This paper shows that fact verification models trained on FEVER can predict labels from claim-only cues without considering evidence, and proposes a regularization method that reduces the effect of this bias, as a step towards sounder evaluation of reasoning abilities.

## Contribution

The paper identifies claim-only biases in the FEVER dataset, introduces an evaluation set that avoids them so that models must reason over evidence, and proposes a regularization method that mitigates the effect of the bias in the training data.

## Key findings

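- Claim-only classifiers, which never see the evidence, perform competitively with top evidence-aware models on FEVER.
- The performance of FEVER-trained models drops significantly on the new evaluation set that avoids these cues.
- The proposed regularization method improves results on the newly created test set.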
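
One simple way to see what a "claim-only cue" looks like is to estimate how strongly each claim n-gram is associated with a label, ignoring the evidence entirely. The sketch below is an illustration under stated assumptions, not the paper's exact procedure: the function name `label_given_ngram`, the `min_count` threshold, and the four toy claims are all hypothetical and made up for this example.

```python
from collections import Counter, defaultdict


def label_given_ngram(claims, labels, n=2, min_count=2):
    """Estimate p(label | n-gram) using claim text only (no evidence).

    Hypothetical helper for illustration; n-grams seen in fewer than
    `min_count` claims are dropped as too rare to be informative.
    """
    ngram_label = defaultdict(Counter)
    for claim, label in zip(claims, labels):
        tokens = claim.lower().split()
        # Count each distinct n-gram once per claim.
        seen = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for gram in seen:
            ngram_label[gram][label] += 1

    stats = {}
    for gram, counts in ngram_label.items():
        total = sum(counts.values())
        if total >= min_count:
            stats[gram] = {lab: c / total for lab, c in counts.items()}
    return stats


# Toy, made-up claims (not taken from FEVER).
claims = [
    "the film did not win any award",
    "the singer did not release an album",
    "the film won an award",
    "the singer released an album",
]
labels = ["REFUTES", "REFUTES", "SUPPORTS", "SUPPORTS"]

cues = label_given_ngram(claims, labels)
for gram, dist in sorted(cues.items()):
    print(" ".join(gram), dist)
```

In this toy data the bigram "did not" appears only in refuted claims, so a classifier could predict the label from the claim alone, while a neutral bigram such as "the film" is split evenly between labels.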

## Abstract

Fact verification requires validating a claim in the context of evidence. We show, however, that in the popular FEVER dataset this might not necessarily be the case. Claim-only classifiers perform competitively with top evidence-aware models. In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely based on the claim, without considering any evidence. We create an evaluation set that avoids those idiosyncrasies. The performance of FEVER-trained models significantly drops when evaluated on this test set. Therefore, we introduce a regularization method which alleviates the effect of bias in the training data, obtaining improvements on the newly created test set. This work is a step towards a more sound evaluation of reasoning capabilities in fact verification models.
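
The abstract does not spell out the regularizer, so the following is only a generic sketch of one common way to reduce the influence of biased training examples: instance re-weighting of the training loss. It should not be read as the authors' method. The function `weighted_nll`, the example probabilities, and the weights are hypothetical, and the sketch assumes the per-example weights are already given (for instance, lower for examples whose claims contain strong label cues).

```python
import math


def weighted_nll(probs, gold, weights):
    """Weighted average negative log-likelihood.

    probs:   per-example predicted label distributions
    gold:    index of the correct label for each example
    weights: non-negative per-example weights (hypothetical; assumed given)
    """
    total = sum(weights)
    loss = sum(w * -math.log(p[g]) for p, g, w in zip(probs, gold, weights))
    return loss / total


# Two made-up examples: the first is predicted confidently (possibly via a
# claim-only cue), the second is harder.
probs = [[0.9, 0.1], [0.6, 0.4]]
gold = [0, 0]

uniform = weighted_nll(probs, gold, [1.0, 1.0])
# Down-weight the first example so the harder one dominates the objective.
reweighted = weighted_nll(probs, gold, [0.2, 1.0])
print(uniform, reweighted)
```

With the first example down-weighted, the loss is dominated by the harder example, so gradient updates would focus less on cases that a claim-only cue already solves.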

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.05267/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1908.05267/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1908.05267/full.md

---
Source: https://tomesphere.com/paper/1908.05267