ContextGuard-LVLM: Enhancing News Veracity through Fine-grained Cross-modal Contextual Consistency Verification

Sihan Ma; Qiming Wu; Ruotong Jiang; Frank Burns

arXiv:2508.06623 · cs.CV · August 12, 2025

Open Access

TL;DR

This paper introduces ContextGuard-LVLM, a novel framework leveraging advanced vision-language models and multi-stage reasoning to improve fine-grained cross-modal contextual verification in news content, outperforming existing zero-shot baselines.

Contribution

The paper presents a new model and augmented versions of three established datasets (TamperedNews-Ent, News400-Ent, MMG-Ent) for detecting subtle visual-textual inconsistencies in news content, and reports gains over zero-shot LVLM baselines.

Findings

01. Outperforms zero-shot LVLM baselines in fine-grained tasks

02. Shows robustness to subtle perturbations

03. Aligns better with human judgments

Abstract

The proliferation of digital news media necessitates robust methods for verifying content veracity, particularly regarding the consistency between visual and textual information. Traditional approaches often fall short in addressing the fine-grained cross-modal contextual consistency (FCCC) problem, which encompasses deeper alignment of visual narrative, emotional tone, and background information with text, beyond mere entity matching. To address this, we propose ContextGuard-LVLM, a novel framework built upon advanced Vision-Language Large Models (LVLMs) and integrating a multi-stage contextual reasoning mechanism. Our model is uniquely enhanced through reinforced or adversarial learning paradigms, enabling it to detect subtle contextual misalignments that evade zero-shot baselines. We extend and augment three established datasets (TamperedNews-Ent, News400-Ent, MMG-Ent) with new…

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Sentiment Analysis and Opinion Mining