Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation
Jen-tse Huang, Chang Chen, Shiyang Lai, Wenxuan Wang, Michelle R. Kaufman, Mark Dredze

TL;DR
This paper evaluates the robustness of Multimodal Large Language Models against misinformation in Chinese short videos, focusing on cognitive biases and deceptive patterns across various health-related domains.
Contribution
It introduces a new annotated dataset and an evaluation framework for assessing MLLMs' ability to detect misinformation and cognitive biases in short videos.
Findings
Gemini-2.5-Pro achieves the highest belief score of 71.5/100.
O3 performs the worst with a score of 35.2.
Models are susceptible to biases like authoritative channel IDs.
Abstract
Short-video platforms have become major channels for misinformation, where deceptive claims frequently leverage visual experiments and social cues. While Multimodal Large Language Models (MLLMs) have demonstrated impressive reasoning capabilities, their robustness against misinformation entangled with cognitive biases remains under-explored. In this paper, we introduce a comprehensive evaluation framework using a high-quality, manually annotated dataset of 200 short videos spanning four health domains. This dataset provides fine-grained annotations for three deceptive patterns-experimental errors, logical fallacies, and fabricated claims-each verified by evidence such as national standards and academic literature. We evaluate eight frontier MLLMs across five modality settings. Experimental results demonstrate that Gemini-2.5-Pro achieves the highest performance in the multimodal setting…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
