MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs

Huiyi Chen; Jiawei Peng; Dehai Min; Changchang Sun; Kaijie Chen; Yan Yan; Xu Yang; Lu Cheng

arXiv:2511.14159 · cs.CV · May 20, 2026

TL;DR

MVI-Bench is a benchmark for evaluating the robustness of Large Vision-Language Models (LVLMs) to misleading visual inputs, a challenge that existing robustness benchmarks, which focus on hallucination or misleading textual inputs, largely overlook.

Contribution

We introduce MVI-Bench, the first benchmark focusing on misleading visual inputs in LVLMs, along with a novel sensitivity metric for detailed robustness assessment.

Findings

01. State-of-the-art LVLMs show significant vulnerabilities to misleading visual inputs.
02. MVI-Bench uncovers specific weaknesses at different hierarchical levels of misleading visual cues.
03. The benchmark and code facilitate future development of more robust LVLMs.

Abstract

Evaluating the robustness of Large Vision-Language Models (LVLMs) is essential for their continued development and responsible deployment in real-world applications. However, existing robustness benchmarks typically focus on hallucination or misleading textual inputs, while largely overlooking the equally critical challenge posed by misleading visual inputs in assessing visual understanding. To fill this important gap, we introduce MVI-Bench, the first comprehensive benchmark specially designed for evaluating how Misleading Visual Inputs undermine the robustness of LVLMs. Grounded in fundamental visual primitives, the design of MVI-Bench centers on three hierarchical levels of misleading visual inputs: Visual Concept, Visual Attribute, and Visual Relationship. Using this taxonomy, we curate six representative categories and compile 1,248 expertly annotated VQA instances. To facilitate…

Code & Models

Repositories

chenyil6/MVI-Bench (GitHub)

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)