SycoPhantasy: Quantifying Sycophancy and Hallucination in Small Open Weight VLMs for Vision-Language Scoring of Fantasy Characters
Arya Shah, Deepali Mishra, Chaklam Silpasuwanchai

TL;DR
This paper investigates the tendency of small open-weight vision-language models to give high scores without proper visual grounding, introducing a metric to quantify this behavior and analyzing model size effects.
Contribution
It introduces the Bluffing Coefficient to measure sycophantic evaluation and shows that smaller models are more prone than larger ones to assigning unjustified high scores.
Findings
Smaller models exhibit higher sycophancy rates.
The smallest model (450M parameters) had a 22.3% sycophantic evaluation rate.
Larger models show significantly lower sycophancy, down to 6.0% for the largest (8B parameters).
Abstract
Vision-language models (VLMs) are increasingly deployed as evaluators in tasks requiring nuanced image understanding, yet their reliability in scoring alignment between images and text descriptions remains underexplored. We investigate whether small, open-weight VLMs exhibit sycophantic behavior when evaluating image-text alignment: assigning high scores without grounding their judgments in visual evidence. To quantify this phenomenon, we introduce the Bluffing Coefficient, a metric that measures the mismatch between a model's score and its evidence recall. We evaluate six open-weight VLMs ranging from 450M to 8B parameters on a benchmark of 173,810 AI-generated character portraits paired with detailed textual descriptions. Our analysis reveals a significant inverse correlation between model size and sycophancy rate, with smaller models…
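To make the idea of a score-versus-evidence mismatch concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's definition: the abstract does not give the formula, so the difference form (normalized score minus evidence recall), the 0-10 score scale, the 0.5 flagging threshold, and every name below (`Evaluation`, `evidence_recall`, `bluffing_coefficient`, `sycophancy_rate`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Evaluation:
    """One model judgment of an image-description pair (hypothetical schema)."""
    score: float           # alignment score given by the model, assumed 0-10
    attributes_cited: int  # described attributes the model's rationale mentions
    attributes_total: int  # attributes listed in the text description


def evidence_recall(ev: Evaluation) -> float:
    """Fraction of described attributes the model actually cited."""
    if ev.attributes_total == 0:
        return 0.0
    return ev.attributes_cited / ev.attributes_total


def bluffing_coefficient(ev: Evaluation, max_score: float = 10.0) -> float:
    """Assumed mismatch measure: normalized score minus evidence recall.

    Positive values mean the score is higher than the cited evidence supports.
    """
    return ev.score / max_score - evidence_recall(ev)


def sycophancy_rate(evals: List[Evaluation], threshold: float = 0.5) -> float:
    """Share of evaluations whose mismatch exceeds an assumed threshold."""
    if not evals:
        return 0.0
    flagged = sum(1 for ev in evals if bluffing_coefficient(ev) > threshold)
    return flagged / len(evals)


if __name__ == "__main__":
    sample = [
        Evaluation(score=9, attributes_cited=1, attributes_total=10),  # high score, little evidence
        Evaluation(score=9, attributes_cited=8, attributes_total=10),  # high score, well grounded
    ]
    print(f"sycophancy rate: {sycophancy_rate(sample):.2f}")
```

Under these assumptions, a high score backed by few cited attributes yields a large positive coefficient, while a high score backed by most attributes stays near zero. The paper's actual metric and threshold may differ.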
