TL;DR
VeriTaS is a dynamic, multimodal benchmark for automated fact-checking (AFC). It adds new claims over time to limit data leakage and to evaluate models under conditions closer to real-world use.
Contribution
The paper introduces what the authors describe as the first dynamic, multilingual, multimodal AFC benchmark, with automated claim updating and a standardized scoring scheme, to address the limitations of static benchmarks.
Findings
Automated annotations closely match human judgments.
VeriTaS covers 54 languages and includes textual and audiovisual claims.
The benchmark is designed to be leakage-resistant and adaptable over time.
Abstract
The growing scale of online misinformation urgently demands Automated Fact-Checking (AFC). Existing benchmarks for evaluating AFC systems, however, are largely limited in terms of task scope, modalities, domain, language diversity, realism, or coverage of misinformation types. Critically, they are static, thus subject to data leakage as their claims enter the pretraining corpora of LLMs. As a result, benchmark performance no longer reliably reflects the actual ability to verify claims. We introduce Verified Theses and Statements (VeriTaS), the first dynamic benchmark for multimodal AFC, designed to remain robust under ongoing large-scale pretraining of foundation models. VeriTaS currently comprises 25,000 real-world claims from 104 professional fact-checking organizations across 54 languages, covering textual and audiovisual content. Claims are added quarterly via a fully automated…
