Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation
Yerin Hwang, Dongryeol Lee, Kyungmin Min, Taegwan Kang, Yong-il Kim, and Kyomin Jung

TL;DR
This paper investigates how vulnerable LVLM-based evaluators are to adversarial visual manipulations. It finds widespread biases that inflate scores and persist despite prompt-based mitigation, which calls the reliability of these judges into question.
Contribution
The paper introduces FRAME, a novel benchmark for systematically evaluating visual biases in LVLM judges, and demonstrates that the judges are susceptible across multiple domains and evaluation methods.
Findings
LVLM judges are consistently fooled by manipulated images.
Combining multiple biases amplifies score inflation.
Visual biases persist even with prompt-based mitigation.
Abstract
Recently, large vision-language models (LVLMs) have emerged as the preferred tools for judging text-image alignment, yet their robustness along the visual modality remains underexplored. This work is the first study to address a key research question: Can adversarial visual manipulations systematically fool LVLM judges into assigning unfairly inflated scores? We define potential image-induced biases within the context of T2I evaluation and examine how these biases affect the evaluations of LVLM judges. Moreover, we introduce a novel, fine-grained, multi-domain meta-evaluation benchmark named FRAME, which is deliberately constructed to exhibit diverse score distributions. By introducing the defined biases into the benchmark, we reveal that all tested LVLM judges exhibit vulnerability across all domains, consistently inflating scores for manipulated images. Further analysis reveals that…
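One way to make "inflated scores" concrete is to compare a judge's score on each original image with its score on the manipulated version of the same image. The sketch below assumes such paired scores are already available. The function name `score_inflation` and the sample numbers are hypothetical illustrations, not taken from the paper, which may define its metrics differently.

```python
from statistics import mean


def score_inflation(original_scores, manipulated_scores):
    """Summarize how a judge's scores shift after image manipulation.

    Both arguments are equal-length sequences of paired scores: index i
    holds the scores for the same prompt before and after manipulation.
    Returns (mean score change, fraction of pairs whose score went up).
    """
    if len(original_scores) != len(manipulated_scores):
        raise ValueError("score lists must be paired and of equal length")
    if not original_scores:
        raise ValueError("need at least one score pair")

    # Positive delta means the manipulated image was scored higher.
    deltas = [m - o for o, m in zip(original_scores, manipulated_scores)]
    mean_delta = mean(deltas)
    inflated_rate = sum(d > 0 for d in deltas) / len(deltas)
    return mean_delta, inflated_rate


# Made-up scores for illustration only (not results from the paper).
original = [2, 3, 4, 1]
manipulated = [3, 3, 5, 2]
mean_delta, inflated_rate = score_inflation(original, manipulated)
```

A mean change above zero together with a high inflated rate would be consistent with the kind of bias the abstract describes. A real analysis would also need to confirm that the manipulation left the actual text-image alignment unchanged.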
Taxonomy
Topics: Speech and dialogue systems · Semantic Web and Ontologies
