MultiCheck: Strengthening Web Trust with Unified Multimodal Fact Verification
Aditya Kishore, Gaurav Kumar, Jasabanta Patro

TL;DR
MultiCheck is a lightweight, interpretable multimodal fact verification framework that detects misinformation across text, images, and OCR content and is suited to low-resource environments.
Contribution
The paper introduces a relational fusion module and a contrastive alignment objective for multimodal fact verification, with an emphasis on interpretability and robustness.
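To make the contrastive alignment idea concrete, here is a minimal sketch in plain Python. The summary does not specify which pairs are contrasted or which loss form the authors use, so this assumes a standard InfoNCE-style loss over matched embedding pairs. The function names, the cosine similarity, and the `temperature` value are illustrative assumptions, not the paper's implementation.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def contrastive_alignment_loss(
    anchors: Sequence[Vector],
    candidates: Sequence[Vector],
    temperature: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """InfoNCE-style loss: anchors[i] is assumed to match candidates[i].

    Every other candidate in the batch acts as a negative for anchor i.
    """
    if len(anchors) != len(candidates):
        raise ValueError("anchors and candidates must be paired one-to-one")
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of the matched candidate.
        total += log_denominator - logits[i]
    return total / len(anchors)
```

With this formulation the loss is small when each anchor is most similar to its own partner and grows when a mismatched candidate scores higher, which is the general behavior a contrastive alignment objective is meant to encourage.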
Findings
Significant performance improvements on Factify-2 and Mocheg benchmarks.
Robustness under noisy OCR and missing modality conditions.
Low computational overhead suitable for resource-constrained deployment.
Abstract
Misinformation on the web increasingly appears in multimodal forms, combining text, images, and OCR-rendered content in ways that amplify harm to public trust and vulnerable communities. While prior fact-checking systems often rely on unimodal signals or shallow fusion strategies, modern misinformation campaigns operate across modalities and require models that can reason over subtle cross-modal inconsistencies in a transparent and responsible manner. We introduce MultiCheck, a lightweight and interpretable framework for multimodal fact verification that jointly analyzes textual, visual, and OCR evidence. At its core, MultiCheck employs a relational fusion module based on element-wise difference and product operations, allowing for explicit cross-modal interaction modeling with minimal computational overhead. A contrastive alignment objective further helps the model distinguish between…
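The relational fusion step described in the abstract can be sketched as follows. The abstract names only the element-wise difference and product operations. The pairing of modalities, the concatenation of results, and the function names `relational_features` and `fuse_modalities` are assumptions made for illustration, and the embeddings are assumed to share one dimension.

```python
from typing import List, Sequence

Vector = Sequence[float]


def relational_features(a: Vector, b: Vector) -> List[float]:
    """Return the element-wise difference followed by the element-wise product."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    difference = [x - y for x, y in zip(a, b)]
    product = [x * y for x, y in zip(a, b)]
    return difference + product


def fuse_modalities(text: Vector, image: Vector, ocr: Vector) -> List[float]:
    """Concatenate relational features for each pair of modalities.

    Pairing every modality with every other one is an assumption here;
    the abstract does not say which pairs the module compares.
    """
    fused: List[float] = []
    for a, b in ((text, image), (text, ocr), (image, ocr)):
        fused.extend(relational_features(a, b))
    return fused
```

Both operations are parameter-free and linear in the embedding dimension, which is consistent with the abstract's claim of minimal computational overhead. For embeddings of dimension d, this sketch yields a fused vector of length 6d that a downstream classifier could consume.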
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
