RW-Post: Auditable Evidence-Grounded Multimodal Fact-Checking in the Wild

Danni Xu; Shaojing Fan; Harry Cheng; Mohan Kankanhalli

arXiv:2512.22933 · cs.AI · May 13, 2026

TL;DR

RW-Post is a new multimodal fact-checking benchmark with auditable annotations, enabling systematic evaluation of models' ability to ground evidence and verify visual and textual information in social media posts.

Contribution

The paper introduces RW-Post, a benchmark that links each social media post to reasoning traces and evidence items derived from human fact-check articles, and that supports controlled evaluation under closed-book, evidence-bounded, and open-web regimes.

Findings

01. Current models have significant room for improvement in evidence grounding.

02. Evidence-bounded evaluation improves model accuracy and faithfulness.

03. RW-Post enables systematic diagnosis of visual grounding and evidence utilization.

Abstract

Multimodal misinformation increasingly leverages visual persuasion, where repurposed or manipulated images strengthen misleading text. We introduce RW-Post, a post-aligned text-image benchmark for real-world multimodal fact-checking with auditable annotations: each instance links the original social-media post with reasoning traces and explicitly linked evidence items derived from human fact-check articles via an LLM-assisted extraction-and-auditing pipeline. RW-Post supports controlled evaluation across closed-book, evidence-bounded, and open-web regimes, enabling systematic diagnosis of visual grounding and evidence utilization. We provide AgentFact as a reference verification baseline and benchmark strong open-source LVLMs under unified protocols. Experiments show substantial headroom: current models struggle with faithful evidence grounding, while evidence-bounded evaluation…
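To make the instance structure and the three evaluation regimes concrete, the following is a minimal sketch rather than the benchmark's actual schema or loader. Every name in it (`Instance`, `EvidenceItem`, `Regime`, `build_inputs`, and all field names) is hypothetical. It also assumes one reading of the regimes: closed-book exposes only the post, evidence-bounded additionally exposes the linked evidence items, and open-web lets the verifier retrieve evidence itself.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class Regime(Enum):
    """The three regimes named in the abstract (values are illustrative)."""
    CLOSED_BOOK = "closed-book"
    EVIDENCE_BOUNDED = "evidence-bounded"
    OPEN_WEB = "open-web"


@dataclass
class EvidenceItem:
    """Hypothetical evidence record derived from a fact-check article."""
    evidence_id: str
    text: str


@dataclass
class Instance:
    """Hypothetical benchmark instance: post, label, trace, linked evidence."""
    post_text: str
    image_path: str
    label: str
    reasoning_trace: List[str] = field(default_factory=list)
    evidence: List[EvidenceItem] = field(default_factory=list)


def build_inputs(inst: Instance, regime: Regime) -> Dict[str, Any]:
    """Select what a verifier may see under the given regime.

    The gold label and reasoning trace are never exposed; they are kept
    on the instance for scoring and auditing only.
    """
    inputs: Dict[str, Any] = {
        "post_text": inst.post_text,
        "image_path": inst.image_path,
        "evidence": [],             # empty unless the regime provides it
        "allow_web_search": False,  # assumption: only open-web may retrieve
    }
    if regime is Regime.EVIDENCE_BOUNDED:
        inputs["evidence"] = [e.text for e in inst.evidence]
    elif regime is Regime.OPEN_WEB:
        inputs["allow_web_search"] = True
    return inputs
```

Keeping the regime as an explicit argument means the same instance can be evaluated three ways without changing the data, which is one way to separate failures of visual grounding from failures of evidence use.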
