FaVIQ: FAct Verification from Information-seeking Questions

Jungsoo Park; Sewon Min; Jaewoo Kang; Luke Zettlemoyer; Hannaneh Hajishirzi

arXiv:2107.02153 · cs.CL · March 16, 2022

TL;DR

The paper introduces FAVIQ, a large-scale fact verification dataset based on ambiguous information-seeking questions, designed to challenge models and improve professional fact-checking capabilities.

Contribution

It presents a novel dataset constructed from real-world ambiguous questions, reducing bias and requiring comprehensive evidence understanding for verification.

Findings

01

State-of-the-art models perform poorly on FAVIQ.

02

Training on FAVIQ improves accuracy on professional fact-checking tasks by up to 17% absolute compared with models trained on FEVER or in-domain data.

03

FAVIQ is more natural and less biased than previous datasets.

Abstract

Despite significant interest in developing general purpose fact checking models, it is challenging to construct a large-scale fact verification dataset with realistic real-world claims. Existing claims are either authored by crowdworkers, thereby introducing subtle biases that are difficult to control for, or manually verified by professional fact checkers, causing them to be expensive and limited in scale. In this paper, we construct a large-scale challenging fact verification dataset called FAVIQ, consisting of 188k claims derived from an existing corpus of ambiguous information-seeking questions. The ambiguities in the questions enable automatically constructing true and false claims that reflect user confusions (e.g., the year of the movie being filmed vs. being released). Claims in FAVIQ are verified to be natural, contain little lexical bias, and require a complete understanding…
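The claim-construction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Interpretation` schema, the `build_claims` helper, the hand-written statement templates, and the example years are all hypothetical, chosen only to show how two readings of one ambiguous question could yield true claims (each reading paired with its own answer) and false claims (each reading paired with the other reading's answer).

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Interpretation:
    """One disambiguated reading of an ambiguous question (hypothetical schema)."""
    statement_template: str  # declarative form with an {answer} slot
    answer: str


def build_claims(interpretations):
    """Return (claim, label) pairs.

    A reading filled with its own answer is labeled True; a reading filled
    with a different reading's answer is labeled False.
    """
    claims = [
        (i.statement_template.format(answer=i.answer), True)
        for i in interpretations
    ]
    for a, b in permutations(interpretations, 2):
        if a.answer != b.answer:  # swapping identical answers would stay true
            claims.append((a.statement_template.format(answer=b.answer), False))
    return claims


# Illustrative data only: the years are made up for the example.
example = [
    Interpretation("The movie was filmed in {answer}.", "1990"),
    Interpretation("The movie was released in {answer}.", "1991"),
]
claims = build_claims(example)
for claim, label in claims:
    print(label, claim)
```

The false claims produced this way differ from a true claim by a single swapped answer, which mirrors the filmed-versus-released confusion the abstract gives as an example. A real system would likely need a learned question-to-statement converter rather than fixed templates.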

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Natural Language Processing Techniques