Exposing Cross-Modal Consistency for Fake News Detection in Short-Form Videos

Chong Tian; Yu Wang; Chenxu Yang; Junyi Guan; Zheng Lin; Yuhan Liu; Xiuying Chen; Qirong Ho

arXiv:2603.14992 · cs.AI · March 17, 2026

TL;DR

This paper introduces MAGIC3, a novel fake news detection method for short videos that leverages cross-modal consistency signals across text, visuals, and audio to improve accuracy and efficiency.

Contribution

MAGIC3 is the first model to explicitly capture and use cross-tri-modal consistency at multiple granularities for fake news detection in short videos.

Findings

01. MAGIC3 outperforms non-VLM baselines on FakeSV and FakeTT datasets.

02. The model achieves VLM-level accuracy with significantly higher throughput and lower VRAM usage.

03. Cross-modal consistency signals effectively distinguish real from fake videos.

Abstract

Short-form video platforms are major channels for news but also fertile ground for multimodal misinformation, where each modality appears plausible alone yet cross-modal relationships are subtly inconsistent, such as mismatched visuals and captions. On two benchmark datasets, FakeSV (Chinese) and FakeTT (English), we observe a clear asymmetry: real videos exhibit high text-visual but moderate text-audio consistency, while fake videos show the opposite pattern. Moreover, a single global consistency score forms an interpretable axis along which fake probability and prediction errors vary smoothly. Motivated by these observations, we present MAGIC3 (Modal-Adversarial Gated Interaction and Consistency-Centric Classifier), a detector that explicitly models and exposes cross-tri-modal consistency signals at multiple granularities. MAGIC3 combines explicit pairwise and global consistency modeling…
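To make the idea of pairwise and global consistency concrete, here is a minimal sketch. It is not the MAGIC3 architecture: it assumes each modality has already been encoded into a fixed-length embedding in a shared space, uses cosine similarity as a stand-in consistency measure, and takes the mean of the three pairwise scores as a hypothetical global score. The function names `cosine` and `consistency_scores` are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def consistency_scores(text, visual, audio):
    """Pairwise and global consistency for one video.

    Assumption: the three inputs are embeddings in a shared space.
    Assumption: the global score is the plain mean of the pairwise scores;
    the paper's own definition is not given in this summary.
    """
    pairwise = {
        "text_visual": cosine(text, visual),
        "text_audio": cosine(text, audio),
        "visual_audio": cosine(visual, audio),
    }
    global_score = sum(pairwise.values()) / len(pairwise)
    return {**pairwise, "global": global_score}


if __name__ == "__main__":
    # Toy embeddings: text agrees with the visuals but not with the audio.
    scores = consistency_scores([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```

In a real pipeline the toy vectors would be replaced by encoder outputs, and the resulting scores could be inspected per video or passed to a downstream classifier as additional features.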

Taxonomy

Topics: Misinformation and Its Impacts · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis