Towards Human Cognition: Visual Context Guides Syntactic Priming in Fusion-Encoded Models
Bushi Xiao, Michael Bennie, Jayetri Bardhan, Daisy Zhe Wang

TL;DR
This paper introduces PRISMATIC, a multimodal dataset, together with a new evaluation metric, to study syntactic priming in multimodal large language models, and finds that fusion-encoded models better mimic human-like syntax-vision interactions.
Contribution
It presents the first dataset and metric for syntactic priming in multimodal models, and compares encoding architectures to understand their cognitive alignment.
Findings
Fusion-encoded models show stronger priming-visual similarity correlation.
Models with different architectures exhibit similar priming effects.
PRISMATIC enables standardized evaluation of syntax-vision interactions.
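As an illustration of the first finding, the sketch below shows one way a correlation between priming strength and visual similarity could be computed. It is a minimal example under stated assumptions, not the paper's procedure: the function name `pearson`, the use of Pearson correlation, and the toy values are all hypothetical, and the listing does not say how the authors measure visual similarity or which correlation statistic they use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-item scores (illustrative only, not from the paper):
# one visual-similarity value and one priming score per prime/target pair.
visual_similarity = [0.1, 0.4, 0.5, 0.9]
priming_score = [0.2, 0.3, 0.6, 0.8]

r = pearson(visual_similarity, priming_score)
print(f"priming-visual similarity correlation: {r:.3f}")
```

Under this reading, a model whose priming scores rise with visual similarity would yield a larger positive coefficient than a model whose priming is insensitive to the image.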
Abstract
Structural priming is a cognitive phenomenon where exposure to a particular syntactic structure increases the likelihood of producing the same structure in subsequent utterances. While humans consistently demonstrate structural priming effects across various linguistic contexts, it remains unclear whether multimodal large language models (MLLMs) exhibit similar syntactic preservation behaviors. We introduce PRISMATIC, the first multimodal structural priming dataset, which advances computational linguistics by providing a standardized benchmark for investigating syntax-vision interactions. We propose the Syntactic Preservation Index (SPI), a novel reference-free evaluation metric designed specifically to assess structural priming effects at the sentence level. Using this metric, we constructed and tested models with two different multimodal encoding architectures to investigate their…
Taxonomy
Topics: Multimodal Machine Learning Applications
