Agentic Mixed-Source Multi-Modal Misinformation Detection with Adaptive Test-Time Scaling

Wei Jiang; Tong Chen; Wei Yuan; Quoc Viet Hung Nguyen; Hongzhi Yin

arXiv:2603.02519 · cs.MM · March 4, 2026

Open Access

TL;DR

This paper introduces AgentM3D, a multi-agent framework for zero-shot mixed-source multi-modal misinformation detection (M3D). It combines adaptive test-time scaling with reasoning path exploration, improving detection accuracy over existing VLM-based and agentic baselines.

Contribution

The work proposes a novel multi-agent system with adaptive test-time scaling and reasoning path exploration to improve zero-shot multi-modal misinformation detection.

Findings

01. Achieves state-of-the-art zero-shot detection performance on M3D benchmarks.

02. Outperforms existing VLM-based and agentic baselines.

03. Demonstrates effectiveness of adaptive reasoning in complex misinformation tasks.

Abstract

Vision-language models (VLMs) have been proven effective for detecting multi-modal misinformation on social platforms, especially in zero-shot settings with unavailable or delayed annotations. However, a single VLM's capacity falls short in the more complex mixed-source multi-modal misinformation detection (M3D) task. Taking captioned images as an example, in M3D, false information can originate from untruthful texts, forged images, or mismatches between the two modalities. Although recent agentic systems can handle zero-shot M3D by connecting modality-specific VLM agents, their effectiveness is still bottlenecked by their architecture. In existing agentic M3D solutions, for any input sample, each agent performs only one forward reasoning pass, making decisions prone to model randomness and reasoning errors in challenging cases. Moreover, the lack of exploration over alternative…

Taxonomy

Topics: Misinformation and Its Impacts · Multimodal Machine Learning Applications · Ethics and Social Impacts of AI