FaVIQ: FAct Verification from Information-seeking Questions
Jungsoo Park, Sewon Min, Jaewoo Kang, Luke Zettlemoyer, Hannaneh Hajishirzi

TL;DR
The paper introduces FAVIQ, a large-scale fact verification dataset built from ambiguous information-seeking questions, designed to challenge current models and to serve as training data that improves performance on professional fact-checking.
Contribution
It presents a new dataset constructed from real-world ambiguous questions, which reduces lexical bias and requires a complete understanding of the evidence for verification.
Findings
State-of-the-art models perform poorly on FAVIQ.
Training on FAVIQ improves fact-checking accuracy by up to 17%.
FAVIQ is more natural and less biased than previous datasets.
Abstract
Despite significant interest in developing general purpose fact checking models, it is challenging to construct a large-scale fact verification dataset with realistic real-world claims. Existing claims are either authored by crowdworkers, thereby introducing subtle biases that are difficult to control for, or manually verified by professional fact checkers, causing them to be expensive and limited in scale. In this paper, we construct a large-scale challenging fact verification dataset called FAVIQ, consisting of 188k claims derived from an existing corpus of ambiguous information-seeking questions. The ambiguities in the questions enable automatically constructing true and false claims that reflect user confusions (e.g., the year of the movie being filmed vs. being released). Claims in FAVIQ are verified to be natural, contain little lexical bias, and require a complete understanding…
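The claim-construction idea in the abstract can be illustrated with a minimal sketch. It assumes each ambiguous question has already been split into disambiguated readings, each with its own answer. A reading paired with its own answer gives a true claim, and a reading paired with another reading's answer gives a false claim that mirrors a plausible user confusion. The `Reading` class, the `build_claims` function, the sentence templates, and the years are hypothetical placeholders for illustration. They are not taken from the paper, and the paper's actual pipeline may differ.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Reading:
    """One disambiguated reading of an ambiguous question (hypothetical structure)."""
    template: str  # declarative sentence with an {answer} slot
    answer: str    # answer that is correct for this reading


def build_claims(readings):
    """Return (claim_text, label) pairs built from a list of readings.

    True claims pair a reading with its own answer. False claims pair a
    reading with the answer of a different reading.
    """
    claims = [(r.template.format(answer=r.answer), True) for r in readings]
    for r, other in permutations(readings, 2):
        if other.answer != r.answer:  # a shared answer would not be false
            claims.append((r.template.format(answer=other.answer), False))
    return claims


# Made-up example in the spirit of "filmed vs. released"; the years are invented.
readings = [
    Reading("The movie was filmed in {answer}.", "2019"),
    Reading("The movie was released in {answer}.", "2020"),
]
claims = build_claims(readings)
for text, label in claims:
    print(label, text)
```

Swapping answers across readings keeps the false claims lexically close to the true ones, which is one way a dataset can avoid being solvable from surface cues alone.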
Taxonomy
Topics: Topic Modeling · Software Engineering Research · Natural Language Processing Techniques
