Multimodal Large Language Models to Support Real-World Fact-Checking
Jiahui Geng, Yova Kementchedjhieva, Preslav Nakov, Iryna Gurevych

TL;DR
This paper systematically evaluates multimodal large language models' ability to support real-world fact-checking, revealing GPT-4V's superior performance and highlighting biases in open-source models, with implications for trustworthy misinformation detection.
Contribution
It introduces a novel framework for assessing MLLMs' fact-checking capabilities using intrinsic knowledge and reasoning, the first such evaluation in this context.
Findings
GPT-4V outperforms others in detecting misleading claims
Open-source models show strong biases and prompt sensitivity
Insights for developing secure, trustworthy multimodal models
Abstract
Multimodal large language models (MLLMs) carry the potential to support humans in processing vast amounts of information. While MLLMs are already being used as a fact-checking tool, their abilities and limitations in this regard are understudied. Here we aim to bridge this gap. In particular, we propose a framework for systematically assessing the capacity of current multimodal models to facilitate real-world fact-checking. Our methodology is evidence-free, leveraging only these models' intrinsic knowledge and reasoning capabilities. By designing prompts that extract models' predictions, explanations, and confidence levels, we delve into research questions concerning model accuracy, robustness, and reasons for failure. We empirically find that (1) GPT-4V exhibits superior performance in identifying malicious and misleading multimodal claims, with the ability to explain the unreasonable…
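The abstract mentions prompts that elicit a prediction, an explanation, and a confidence level from each model. The sketch below shows one way such a prompt and its response parser might look. It is an illustration under assumptions, not the authors' implementation: the prompt wording, the 0 to 100 confidence scale, and the names `PROMPT_TEMPLATE`, `Verdict`, `build_prompt`, and `parse_response` are all hypothetical, and no model API is called.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical prompt wording; the paper's actual prompts are not shown here.
PROMPT_TEMPLATE = (
    "Is the following claim about the attached image true or false?\n"
    "Claim: {claim}\n"
    "Answer in exactly this format:\n"
    "Prediction: <true|false|uncertain>\n"
    "Explanation: <one or two sentences>\n"
    "Confidence: <integer from 0 to 100>"
)


@dataclass
class Verdict:
    prediction: Optional[str]    # lower-cased label, or None if missing
    explanation: Optional[str]   # free-text rationale, or None if missing
    confidence: Optional[int]    # 0..100 (assumed scale), or None if missing


def build_prompt(claim: str) -> str:
    """Fill the (assumed) template with a textual claim."""
    return PROMPT_TEMPLATE.format(claim=claim.strip())


def parse_response(text: str) -> Verdict:
    """Extract the three fields from a model reply; missing fields become None."""

    def field(name: str) -> Optional[str]:
        match = re.search(
            rf"^{name}:\s*(.+)$", text, flags=re.IGNORECASE | re.MULTILINE
        )
        return match.group(1).strip() if match else None

    prediction = field("Prediction")
    raw_confidence = field("Confidence")
    digits = re.search(r"\d+", raw_confidence) if raw_confidence else None
    return Verdict(
        prediction=prediction.lower() if prediction else None,
        explanation=field("Explanation"),
        # Clamp to the assumed 0..100 scale.
        confidence=min(int(digits.group()), 100) if digits else None,
    )
```

Returning `None` for a missing field, rather than raising, lets malformed replies be counted separately when studying robustness and failure cases.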
Taxonomy
Topics: Topic Modeling · Software Engineering Research · Advanced Text Analysis Techniques
