Unsupervised Pretraining for Fact Verification by Language Model Distillation

Adrián Bazaga; Pietro Liò; Gos Micklem

arXiv:2309.16540 · cs.CL · March 8, 2024

TL;DR

This paper introduces SFAVEL, an unsupervised pretraining framework that uses language model distillation and a contrastive loss to improve fact verification by generating high-quality claim-evidence alignments without annotations.

Contribution

It presents a novel unsupervised approach for fact verification that surpasses previous methods by leveraging language model distillation and a new contrastive loss function.

Findings

01  Achieves state-of-the-art results on FB15k-237 with +5.3% Hits@1

02  Achieves state-of-the-art results on FEVER with +8% accuracy

03  Demonstrates effectiveness of unsupervised pretraining for fact verification
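
For readers unfamiliar with the Hits@1 metric named in the findings, the sketch below shows how Hits@k is commonly computed: the fraction of queries whose correct answer appears among the top k ranked candidates. The function name `hits_at_k`, the entity IDs, and the rankings are hypothetical toy values chosen for illustration; this is not the paper's evaluation code.

```python
def hits_at_k(ranked_candidates, gold, k=1):
    """Fraction of queries whose gold answer is within the top-k candidates.

    ranked_candidates: one list per query, best candidate first.
    gold: the correct answer for each query, in the same order.
    """
    hits = sum(
        1
        for ranking, answer in zip(ranked_candidates, gold)
        if answer in ranking[:k]
    )
    return hits / len(gold)


# Toy rankings over hypothetical entity IDs (assumed data, not from the paper).
rankings = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
gold = ["a", "d", "e", "h"]

hits_1 = hits_at_k(rankings, gold, k=1)  # gold is ranked first for 2 of 4 queries
hits_2 = hits_at_k(rankings, gold, k=2)  # gold is in the top 2 for all 4 queries
```

Hits@1 is the strictest setting, since only the top-ranked candidate counts.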

Abstract

Fact verification aims to verify a claim using evidence from a trustworthy knowledge base. To address this challenge, algorithms must produce features for every claim that are both semantically meaningful and compact enough to find a semantic alignment with the source information. In contrast to previous work, which tackled the alignment problem by learning over annotated corpora of claims and their corresponding labels, we propose SFAVEL (Self-supervised Fact Verification via Language Model Distillation), a novel unsupervised pretraining framework that leverages pre-trained language models to distil self-supervised features into high-quality claim-fact alignments without the need for annotations. This is enabled by a novel contrastive loss function that encourages features to attain high-quality claim and evidence alignments whilst preserving the semantic relationships across the…
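
To give a rough sense of how a contrastive objective can align claims with evidence, here is a minimal sketch of a generic InfoNCE-style loss. It is an assumption-laden illustration, not SFAVEL's loss function, whose exact form is not given in this abstract. The function names, the temperature value, and the toy 2-dimensional embeddings are all hypothetical, and the embeddings are assumed to be nonzero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(claims, facts, temperature=0.1):
    """Mean InfoNCE loss, treating facts[i] as the positive for claims[i].

    All other facts in the batch act as negatives for that claim.
    """
    total = 0.0
    for i, claim in enumerate(claims):
        logits = [cosine(claim, fact) / temperature for fact in facts]
        # Log-sum-exp with the max subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(claims)


# Toy embeddings (assumed data): each claim points along one axis.
claims = [[1.0, 0.0], [0.0, 1.0]]
aligned_facts = [[0.9, 0.1], [0.1, 0.9]]  # fact i is close to claim i
swapped_facts = [[0.1, 0.9], [0.9, 0.1]]  # fact i is close to the other claim

loss_aligned = info_nce(claims, aligned_facts)
loss_swapped = info_nce(claims, swapped_facts)
```

The loss is small when each claim is most similar to its own fact and large when the pairing is swapped, which is the behaviour a claim-evidence alignment objective needs. In an unsupervised setting the positives would have to come from a signal other than annotations; this sketch simply assumes they are given.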

Code & Models

Repositories

adrianbzg/sfavel
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Balanced Selection