From Articles to Premises: Building PrimeFacts, an Extraction Methodology and Resource for Fact-Checking Evidence
Premtim Sahitaj, Jawan Kolanowski, Ariana Sahitaj, Veronika Solopova, Max Upravitelev, Daniel Röder, Iffat Maab, Junichi Yamagishi, Sebastian Möller, Vera Schmitt

TL;DR
PrimeFacts introduces a methodology and resource for extracting structured, stand-alone evidence from fact-checking articles, significantly enhancing automated claim verification and evidence retrieval.
Contribution
It presents a large-scale dataset and a framework leveraging large language models to extract and rewrite evidence, improving retrieval and verification performance.
Findings
Decontextualized premises improve evidence retrievability by up to 30% in MRR.
Using extracted premises increases claim verification Macro-F1 by 10-20 points.
A qualitative analysis indicates that the extracted premises remain faithful to the original sources.
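For readers unfamiliar with the two metrics named in the findings, the following is a minimal sketch of how MRR (mean reciprocal rank) and Macro-F1 are conventionally computed. It is not the authors' evaluation code: the function names, the toy document IDs, and the labels are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from typing import Hashable, Sequence, Set


def mean_reciprocal_rank(
    ranked_lists: Sequence[Sequence[Hashable]],
    relevant_sets: Sequence[Set[Hashable]],
) -> float:
    """Average of 1/rank of the first relevant item per query (0 if none is retrieved)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two retrieval queries and four verdict predictions.
mrr = mean_reciprocal_rank(
    [["a", "b", "c"], ["x", "y", "z"]],
    [{"b"}, {"x"}],
)
f1 = macro_f1(
    ["true", "false", "true", "false"],
    ["true", "true", "true", "false"],
)
print(f"MRR={mrr:.3f}, Macro-F1={f1:.3f}")
```

Because Macro-F1 weights every verdict class equally, gains on rare classes move the score as much as gains on frequent ones, which matters when verdict labels are imbalanced.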
Abstract
Fact-checking articles encode rich supporting evidence and reasoning, yet this evidence remains largely inaccessible to automated verification systems due to unstructured presentation. We introduce PrimeFacts, a methodology and resource for extracting fine-grained evidence from full fact-checking articles. We compile 13,106 PolitiFact articles with claims, verdicts, and all referenced sources, and we identify 49,718 in-article hyperlinks as natural anchors to pinpoint key evidence. Our framework leverages large language models (LLMs) to rewrite these anchor sentences into stand-alone, context-independent premises and investigates the extraction of additional implicit evidence. In evaluations on cross-article evidence retrieval and claim verification, the extracted premises substantially improve performance. Decontextualized evidence yields higher retrievability, achieving up to a 30…
