MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models

Shengkang Wang; Hongzhan Lin; Ziyang Luo; Zhen Ye; Guang Chen; Jing Ma

arXiv:2406.11288 · cs.CL · February 18, 2025

TL;DR

MFC-Bench is a comprehensive benchmark for evaluating the factual accuracy of large vision-language models across manipulation, out-of-context, and veracity classification tasks, highlighting current models' limitations in multimodal fact-checking.

Contribution

The paper introduces MFC-Bench, a new benchmark for assessing the factual correctness of LVLMs, filling a gap in trustworthy AI evaluation tools.

Findings

01. Current LVLMs perform poorly in multimodal fact-checking.

02. Models show insensitivity to manipulated content.

03. Benchmark results reveal significant room for improvement.

Abstract

Large vision-language models (LVLMs) have significantly improved multimodal reasoning tasks, such as visual question answering and image captioning. These models embed multimodal facts within their parameters, rather than relying on external knowledge bases to store factual information explicitly. However, the content discerned by LVLMs may deviate from factuality due to inherent bias or incorrect inference. To address this issue, we introduce MFC-Bench, a rigorous and comprehensive benchmark designed to evaluate the factual accuracy of LVLMs across three stages of verdict prediction for MFC: Manipulation, Out-of-Context, and Veracity Classification. Through our evaluation on MFC-Bench, we benchmarked a dozen diverse and representative LVLMs, uncovering that current models still fall short in multimodal fact-checking and demonstrate insensitivity to various forms of manipulated content.…
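The three-task structure named in the abstract (Manipulation, Out-of-Context, and Veracity Classification) can be illustrated with a small scoring sketch. This is only an illustration under stated assumptions: it assumes each task is scored as plain label-match accuracy, and the record layout, the label strings, and the function name `per_task_accuracy` are hypothetical. The excerpt does not state which metrics or data format the benchmark actually uses.

```python
from collections import defaultdict

# Task names follow the abstract; everything else in this sketch is hypothetical.
TASKS = ("manipulation", "out_of_context", "veracity")


def per_task_accuracy(records):
    """Return {task: accuracy} for records of (task, gold_label, predicted_label)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, gold, pred in records:
        if task not in TASKS:
            raise ValueError(f"unknown task: {task}")
        total[task] += 1
        correct[task] += int(gold == pred)
    # Only tasks that appear in the records are reported.
    return {task: correct[task] / total[task] for task in total}


# Toy, made-up records for illustration only (not benchmark data).
toy_records = [
    ("manipulation", "manipulated", "pristine"),
    ("manipulation", "pristine", "pristine"),
    ("out_of_context", "out_of_context", "out_of_context"),
    ("veracity", "supported", "refuted"),
]
scores = per_task_accuracy(toy_records)
```

Reporting a separate score per task, rather than one pooled number, keeps a weakness on one task (for example, insensitivity to manipulated content) from being hidden by stronger results on another.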

Code & Models

Repositories

wskbest/mfc-bench · Official

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Misinformation and Its Impacts

Methods: Softmax · Attention Is All You Need