Don't Look at the Numbers: Visual Anchoring Bias and Layer-wise Representation in VLMs
M. Shalankin

TL;DR
This paper investigates how numeric anchors embedded in images bias the quality judgments of six Vision-Language Models (VLMs), revealing architecture-dependent effects and the layer-wise dynamics associated with this bias.
Contribution
It provides a causal explanation of visual anchoring bias in VLMs by analyzing layer-wise representations and architecture-specific fusion mechanisms.
Findings
Numeric anchors significantly bias VLM quality judgments.
The layers that best predict quality lie deeper than the layers at which anchor classification saturates.
Architecture-dependent fusion strategies influence the manifestation of anchoring bias.
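The layer-wise finding above rests on probing: fitting a simple predictor on each layer's activations and comparing held-out accuracy across layers. The sketch below illustrates the idea with a ridge-regression probe on synthetic activations. It is not the paper's pipeline; the function `probe_r2`, the ridge penalty, the train/test split, and the synthetic data (in which the quality signal is made to grow with depth) are all assumptions chosen for illustration.

```python
import numpy as np


def probe_r2(features, targets, ridge=1.0, train_frac=0.8):
    """Fit a ridge-regression probe on a train split; return held-out R^2."""
    n_train = int(len(features) * train_frac)
    x_train, x_test = features[:n_train], features[n_train:]
    y_train, y_test = targets[:n_train], targets[n_train:]
    # Centre with training statistics so no explicit intercept is needed.
    x_mean, y_mean = x_train.mean(axis=0), y_train.mean()
    xc = x_train - x_mean
    gram = xc.T @ xc + ridge * np.eye(xc.shape[1])
    weights = np.linalg.solve(gram, xc.T @ (y_train - y_mean))
    pred = (x_test - x_mean) @ weights + y_mean
    ss_res = ((y_test - pred) ** 2).sum()
    ss_tot = ((y_test - y_test.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for per-layer activations (hypothetical, not real VLM data):
# the quality signal is injected more strongly at deeper layers.
rng = np.random.default_rng(0)
n_samples, dim, n_layers = 400, 16, 6
quality = rng.normal(size=n_samples)
direction = rng.normal(size=dim)

layer_r2 = []
for layer in range(n_layers):
    signal = layer / (n_layers - 1)  # 0.0 at the first layer, 1.0 at the last
    acts = signal * np.outer(quality, direction) + rng.normal(size=(n_samples, dim))
    layer_r2.append(probe_r2(acts, quality))

best_layer = int(np.argmax(layer_r2))
print("held-out R^2 per layer:", [round(r, 2) for r in layer_r2])
print("best layer for quality prediction:", best_layer)
```

With real models, `acts` would be replaced by hidden states extracted at each layer, and a second probe (a classifier for the anchor value) would be fitted on the same activations so the two depth profiles can be compared.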
Abstract
Embedded numeric anchors on images systematically bias Vision-Language Model quality judgments across six VLMs from five architectural families (ANOVA eta^2 = 0.18-0.77, all p < 0.001). Anchor effects are 2.5x larger than severe image quality degradation, confirming bias is not reducible to visual changes. Layer-wise probing reveals consistent dissociation: layers where anchor classification saturates (L12-L34) are suboptimal for quality prediction, with optimal layers deeper (R^2 = 0.69-0.91). Fusion analysis identifies architecture-dependent integration -- instant fusion at L1-L2 in two models versus partial or no fusion in three others. These results establish a causal account of visual anchoring bias, linking behavioral susceptibility to representation dynamics.
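For readers unfamiliar with the effect size quoted above, eta^2 in a one-way ANOVA is the share of total variance explained by the grouping factor: the between-group sum of squares divided by the total sum of squares. The sketch below computes it for made-up quality scores grouped by anchor value. The scores, the group labels, and the helper name `eta_squared` are illustrative assumptions, not data or code from the paper.

```python
import numpy as np


def eta_squared(groups):
    """One-way ANOVA effect size: SS_between / SS_total."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_scores = np.concatenate(groups)
    grand_mean = all_scores.mean()
    ss_total = ((all_scores - grand_mean) ** 2).sum()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Hypothetical quality scores from one model, grouped by the anchor shown on the image.
scores_by_anchor = {
    "anchor=1": [2.0, 3.0, 2.5, 3.5],
    "anchor=5": [5.0, 4.5, 5.5, 5.0],
    "anchor=9": [7.0, 8.0, 7.5, 6.5],
}

eta2 = eta_squared(list(scores_by_anchor.values()))
print(f"eta^2 = {eta2:.3f}")
```

A value near 0 means the anchor explains almost none of the variance in the scores; a value near 1 means the scores are almost entirely determined by the anchor.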
