LiveFact: A Dynamic, Time-Aware Benchmark for LLM-Driven Fake News Detection
Cheng Xu, Changhong Jin, Yingjie Niu, Nan Yan, Yuke Mei, Shuhao Guan, Liming Chen, M-Tahar Kechadi

TL;DR
LiveFact introduces a dynamic, time-aware benchmark for evaluating LLMs in fake news detection, emphasizing reasoning with evolving information and reducing data contamination risks.
Contribution
This paper presents LiveFact, a continuously updated benchmark with dual-mode evaluation to better assess models' reasoning under temporal uncertainty.
Findings
Open-source Mixture-of-Experts models outperform some proprietary systems.
Models demonstrate epistemic humility by recognizing unverifiable claims.
LiveFact effectively monitors benchmark data contamination.
Abstract
The rapid development of Large Language Models (LLMs) has transformed fake news detection and fact-checking tasks from simple classification to complex reasoning. However, evaluation frameworks have not kept pace. Current benchmarks are static, making them vulnerable to benchmark data contamination (BDC) and ineffective at assessing reasoning under temporal uncertainty. To address this, we introduce LiveFact, a continuously updated benchmark that simulates the real-world "fog of war" in misinformation detection. LiveFact uses dynamic, temporal evidence sets to evaluate models on their ability to reason with evolving, incomplete information rather than on memorized knowledge. We propose a dual-mode evaluation: Classification Mode for final verification and Inference Mode for evidence-based reasoning, along with a component to monitor BDC explicitly. Tests with 22 LLMs show that…
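To make the idea of a temporal evidence set concrete, the following is a minimal sketch, not the LiveFact implementation. It assumes each evidence item carries a publication date and that a model is shown only the items available at a given cutoff. The names `Evidence`, `evidence_as_of`, and `temporal_snapshots` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence record: a publication date and a text snippet."""
    published: date
    text: str


def evidence_as_of(evidence, cutoff):
    """Return the items published on or before `cutoff`, oldest first."""
    visible = [e for e in evidence if e.published <= cutoff]
    return sorted(visible, key=lambda e: e.published)


def temporal_snapshots(evidence, cutoffs):
    """Map each cutoff date (in ascending order) to the evidence visible then.

    Assumption: earlier cutoffs expose less evidence, which mimics reasoning
    with incomplete information as a story develops.
    """
    return {c: evidence_as_of(evidence, c) for c in sorted(cutoffs)}
```

Under these assumptions, a claim could be evaluated once per snapshot, so that a model's verdict at an early cutoff is judged against what was knowable at that time rather than against the final outcome.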
