Pixels Versus Priors: Controlling Knowledge Priors in Vision-Language Models through Visual Counterfacts
Michal Golovanevsky, William Rudman, Michael Lepori, Amir Bar, Ritambhara Singh, Carsten Eickhoff

TL;DR
This paper investigates whether multimodal large language models rely more on memorized world knowledge or visual input, introducing a dataset and a control mechanism to steer model predictions between priors and visual evidence.
Contribution
It introduces Visual CounterFact, a dataset of counterfactual images, and Pixels Versus Priors (PvP), a method to control model reliance on priors versus visual input.
Findings
Model predictions initially reflect memorized priors but shift toward visual evidence in mid-to-late layers.
PvP effectively redirects model predictions from priors to visual input.
PvP achieves a high success rate in steering model outputs toward counterfactuals.
Abstract
Multimodal Large Language Models (MLLMs) perform well on tasks such as visual question answering, but it remains unclear whether their reasoning relies more on memorized world knowledge or on the visual information present in the input image. To investigate this, we introduce Visual CounterFact, a new dataset of visually realistic counterfactuals that put world knowledge priors (e.g., red strawberry) into direct conflict with visual input (e.g., blue strawberry). Using Visual CounterFact, we show that model predictions initially reflect memorized priors, but shift toward visual evidence in mid-to-late layers. This dynamic reveals a competition between the two modalities, with visual input ultimately overriding priors during evaluation. To control this behavior, we propose Pixels Versus Priors (PvP) steering vectors, a mechanism for controlling model outputs toward either world knowledge…
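The abstract is cut off before it explains how PvP steering vectors are built, so the following is only a generic illustration of activation steering, not the authors' method. It assumes a steering vector can be taken as the difference between mean hidden activations under two conditions (here called "counterfactual" and "prior") and then added, scaled by a strength `alpha`, to a hidden state. The function names, the toy dimensionality, and the synthetic activations are all hypothetical.

```python
import numpy as np

def compute_steering_vector(acts_counterfactual, acts_prior):
    """Difference of mean activations between two conditions.

    Assumption: both inputs have shape (n_examples, hidden_dim) and come
    from the same layer. This is a generic difference-of-means recipe,
    not a construction confirmed by the source.
    """
    return acts_counterfactual.mean(axis=0) - acts_prior.mean(axis=0)

def apply_steering(hidden, vector, alpha=1.0):
    """Shift a hidden state along the steering vector.

    alpha > 0 pushes toward the first condition (counterfactual here),
    alpha < 0 pushes toward the second (prior here).
    """
    return hidden + alpha * vector

# Toy data: "prior" activations are random; "counterfactual" activations
# are the same vectors shifted by 2 along the first hidden dimension.
rng = np.random.default_rng(0)
hidden_dim = 8
direction = np.zeros(hidden_dim)
direction[0] = 1.0

acts_prior = rng.normal(size=(32, hidden_dim))
acts_counterfactual = acts_prior + 2.0 * direction

steer = compute_steering_vector(acts_counterfactual, acts_prior)

# Steering a "prior" state with alpha=1 recovers its counterfactual twin;
# steering back with alpha=-1 undoes the shift.
steered = apply_steering(acts_prior[0], steer, alpha=1.0)
restored = apply_steering(steered, steer, alpha=-1.0)
```

In a real model the activations would be read from, and written back to, a chosen layer during the forward pass; which layer and what strength to use are choices the truncated abstract does not specify.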
Taxonomy
Topics: Multimodal Machine Learning Applications
