REFLEX: Self-Refining Explainable Fact-Checking via Verdict-Anchored Style Control
Chuyi Kong, Gao Wei, Jing Ma, Hongzhan Lin, Yuxi Sun

TL;DR
REFLEX is a self-refining fact-checking method that improves explanation faithfulness and accuracy by controlling reasoning style anchored on verdicts, reducing hallucinations and enhancing real-time reliability.
Contribution
REFLEX introduces a self-refining paradigm that uses self-disagreement signals to disentangle fact from style, achieving state-of-the-art results with only a small number of samples.
Findings
REFLEX achieves state-of-the-art performance on real-world datasets.
It improves transferability, gaining up to 7.54% on in-the-wild data.
The method effectively reduces hallucinations in explanations.
Abstract
The prevalence of fake news on social media demands automated fact-checking systems that provide accurate verdicts with faithful explanations. However, existing large language model (LLM)-based approaches ignore deceptive misinformation styles in LLM-generated explanations, resulting in unfaithful rationales that can mislead human judgments. They also rely heavily on external knowledge sources, which introduces hallucinations and high latency that undermine the reliability and responsiveness crucial for real-time use. To address these challenges, we propose REason-guided Fact-checking with Latent EXplanations (REFLEX), a self-refining paradigm that explicitly controls reasoning style anchored on the verdict. REFLEX uses self-disagreement veracity signals between the backbone model and its fine-tuned variant to construct steering vectors, naturally disentangling fact from style.…
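The steering-vector idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the construction, so the code below assumes a common difference-of-means recipe from the activation-steering literature. All names (`disagreement_split`, `steering_vector`, `steer`, `alpha`) and the use of gold labels to split the disagreement cases are hypothetical choices made for illustration only.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    dim = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def disagreement_split(
    backbone: Sequence[str], tuned: Sequence[str], gold: Sequence[str]
) -> Tuple[List[int], List[int]]:
    """Find samples where the two models disagree on the verdict.

    Returns (indices where the fine-tuned model is right,
             indices where the backbone is right).
    Assumption: gold verdicts are available to decide which side is right.
    """
    tuned_right: List[int] = []
    backbone_right: List[int] = []
    for i, (b, t, g) in enumerate(zip(backbone, tuned, gold)):
        if b == t:
            continue  # agreement carries no self-disagreement signal
        if t == g:
            tuned_right.append(i)
        elif b == g:
            backbone_right.append(i)
    return tuned_right, backbone_right


def steering_vector(
    activations: Sequence[Sequence[float]],
    positive: Sequence[int],
    negative: Sequence[int],
) -> Vector:
    """Difference of mean activations between two groups of samples."""
    pos = mean_vector([activations[i] for i in positive])
    neg = mean_vector([activations[i] for i in negative])
    return [p - n for p, n in zip(pos, neg)]


def steer(hidden: Sequence[float], vector: Sequence[float], alpha: float = 1.0) -> Vector:
    """Shift a hidden state along the steering direction by strength alpha."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy data: four claims, 2-dimensional "activations" (made up for the sketch).
backbone_verdicts = ["true", "false", "true", "false"]
tuned_verdicts = ["true", "true", "false", "false"]
gold_verdicts = ["true", "true", "true", "false"]
toy_activations = [[0.0, 0.0], [1.0, 2.0], [0.5, 1.0], [0.0, 0.0]]

tuned_idx, backbone_idx = disagreement_split(
    backbone_verdicts, tuned_verdicts, gold_verdicts
)
direction = steering_vector(toy_activations, tuned_idx, backbone_idx)
steered = steer([1.0, 1.0], direction, alpha=2.0)
```

In a real model the activations would be hidden states read from a chosen layer, and the steered state would be written back during generation. Which layer to use, how strong `alpha` should be, and how the paper actually separates fact from style are not specified in the text above.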
