MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
Shengkang Wang, Hongzhan Lin, Ziyang Luo, Zhen Ye, Guang Chen, Jing Ma

TL;DR
MFC-Bench is a comprehensive benchmark for evaluating the factual accuracy of large vision-language models across manipulation, out-of-context, and veracity classification tasks, highlighting current models' limitations in multimodal fact-checking.
Contribution
The paper introduces MFC-Bench, a new benchmark for assessing the factual correctness of LVLMs, filling a gap in trustworthy AI evaluation tools.
Findings
Current LVLMs perform poorly in multimodal fact-checking.
Models show insensitivity to manipulated content.
Benchmark results reveal significant room for improvement.
Abstract
Large vision-language models (LVLMs) have significantly improved multimodal reasoning tasks, such as visual question answering and image captioning. These models embed multimodal facts within their parameters, rather than relying on external knowledge bases to store factual information explicitly. However, the content discerned by LVLMs may deviate from factuality due to inherent bias or incorrect inference. To address this issue, we introduce MFC-Bench, a rigorous and comprehensive benchmark designed to evaluate the factual accuracy of LVLMs across three stages of verdict prediction for multimodal fact-checking (MFC): Manipulation, Out-of-Context, and Veracity Classification. Through our evaluation on MFC-Bench, we benchmarked a dozen diverse and representative LVLMs, uncovering that current models still fall short in multimodal fact-checking and demonstrate insensitivity to various forms of manipulated content.…
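As a rough illustration of how verdicts over the three stages named in the abstract could be scored, the sketch below computes per-task accuracy from (task, gold label, predicted label) records. The task keys, label strings, and the `per_task_accuracy` helper are hypothetical names chosen for this example, and the records are made-up toy data; none of it is taken from the MFC-Bench code or results, and the paper may report different metrics.

```python
from collections import defaultdict


def per_task_accuracy(records):
    """Return {task: accuracy} for an iterable of (task, gold, pred) tuples.

    The record layout is an assumption made for this sketch, not the
    benchmark's actual data format.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, gold, pred in records:
        total[task] += 1
        correct[task] += int(gold == pred)
    # Only tasks that appear in the records are reported, so no division by zero.
    return {task: correct[task] / total[task] for task in total}


# Toy records for illustration only; labels and values are invented.
toy_records = [
    ("manipulation", "manipulated", "manipulated"),
    ("manipulation", "manipulated", "original"),
    ("out_of_context", "mismatch", "mismatch"),
    ("veracity", "refuted", "supported"),
    ("veracity", "supported", "supported"),
]

scores = per_task_accuracy(toy_records)
for task, accuracy in scores.items():
    print(f"{task}: {accuracy:.2f}")
```

Reporting each stage separately, rather than one pooled number, keeps a weakness on one stage (for example, insensitivity to manipulated content) from being hidden by stronger results on another.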
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Misinformation and Its Impacts
Methods: Softmax · Attention Is All You Need
