CrossAug: A Contrastive Data Augmentation Method for Debiasing Fact Verification Models

Minwoo Lee; Seungpil Won; Juae Kim; Hwanhee Lee; Cheoneum Park; Kyomin Jung

arXiv:2109.15107·cs.CL·October 1, 2021

TL;DR

CrossAug is a contrastive data augmentation technique that reduces bias in fact verification models by generating new claims and evidence and pairing them cross-wise with the original samples, yielding more robust models, especially in low-resource scenarios.

Contribution

The paper introduces CrossAug, a two-stage contrastive data augmentation method that debiases fact verification models, improving performance on the debiased extension of the FEVER dataset and in low-data settings.

Findings

01. Outperforms the previous state-of-the-art debiasing technique by 3.6% on the debiased extension of the FEVER dataset.

02. Achieves a total performance boost of 10.13% over the baseline.

03. Remains effective in low-resource scenarios, with as little as 1% of the data.

Abstract

Fact verification datasets are typically constructed using crowdsourcing techniques due to the lack of text sources with veracity labels. However, the crowdsourcing process often produces undesired biases in data that cause models to learn spurious patterns. In this paper, we propose CrossAug, a contrastive data augmentation method for debiasing fact verification models. Specifically, we employ a two-stage augmentation pipeline to generate new claims and evidences from existing samples. The generated samples are then paired cross-wise with the original pair, forming contrastive samples that facilitate the model to rely less on spurious patterns and learn more robust representations. Experimental results show that our method outperforms the previous state-of-the-art debiasing technique by 3.6% on the debiased extension of the FEVER dataset, with a total performance boost of 10.13% from…
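
The cross-wise pairing described in the abstract can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation. It assumes the original pair is labeled SUPPORTS, that pairing an original item with a generated counterpart flips the label to REFUTES, and that pairing the two generated items restores SUPPORTS; the abstract itself does not state these label assignments. The names `Sample`, `cross_augment`, `toy_negate_claim`, and `toy_edit_evidence` are hypothetical, and the two toy functions are simple string substitutions standing in for the paper's actual generation stages.

```python
from dataclasses import dataclass
from typing import Callable, List

SUPPORTS = "SUPPORTS"
REFUTES = "REFUTES"


@dataclass(frozen=True)
class Sample:
    claim: str
    evidence: str
    label: str


def cross_augment(
    claim: str,
    evidence: str,
    negate_claim: Callable[[str], str],
    edit_evidence: Callable[[str, str, str], str],
) -> List[Sample]:
    """Build four contrastive samples from one (assumed) SUPPORTS pair."""
    # Stage 1 (assumed): produce a claim that contradicts the original.
    neg_claim = negate_claim(claim)
    # Stage 2 (assumed): edit the evidence so it agrees with the new claim.
    new_evidence = edit_evidence(evidence, claim, neg_claim)
    # Cross-wise pairing: mixing original and generated items flips the label.
    return [
        Sample(claim, evidence, SUPPORTS),
        Sample(neg_claim, evidence, REFUTES),
        Sample(claim, new_evidence, REFUTES),
        Sample(neg_claim, new_evidence, SUPPORTS),
    ]


def toy_negate_claim(claim: str) -> str:
    # Hypothetical stand-in for a learned generator: swap one hard-coded entity.
    return claim.replace("Paris", "Rome")


def toy_edit_evidence(evidence: str, claim: str, neg_claim: str) -> str:
    # Hypothetical stand-in: copy the word-level change between the two claims
    # into the evidence text.
    claim_words, neg_words = claim.split(), neg_claim.split()
    removed = [w for w in claim_words if w not in neg_words]
    added = [w for w in neg_words if w not in claim_words]
    for old, new in zip(removed, added):
        evidence = evidence.replace(old, new)
    return evidence


if __name__ == "__main__":
    for sample in cross_augment(
        "The Eiffel Tower is located in Paris",
        "The Eiffel Tower is a wrought-iron tower in Paris, France.",
        toy_negate_claim,
        toy_edit_evidence,
    ):
        print(sample.label, "|", sample.claim, "|", sample.evidence)
```

Because each claim and each evidence text appears once with each label under this assumed scheme, a model cannot predict the label from the claim alone, which is the kind of spurious pattern the abstract refers to.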

Code & Models

Repositories

minwhoo/crossaug
PyTorch · Official

Models

🤗 minwhoo/bart-base-negative-claim-generation
model · 4 dl · ♡ 6