A Psycholinguistic Evaluation of Language Models' Sensitivity to Argument Roles
Eun-Kyoung Rosa Lee, Sathvik Nair, Naomi Feldman

TL;DR
This study systematically evaluates large language models' sensitivity to argument roles (who did what to whom). The models can distinguish verbs that appear in plausible contexts from verbs that appear in implausible ones, but they do not reproduce the selective patterns that humans show during real-time verb prediction.
Contribution
The paper replicates psycholinguistic studies of human argument role processing with language models and compares the models' behavior with human real-time sentence comprehension.
Findings
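Models distinguish verbs that appear in plausible contexts from verbs that appear in implausible contexts.
None of the models reproduces the selective patterns that humans show during real-time verb prediction.
Models' capacity to detect verb plausibility therefore does not appear to arise from the mechanism that underlies human real-time sentence processing.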
Abstract
We present a systematic evaluation of large language models' sensitivity to argument roles, i.e., who did what to whom, by replicating psycholinguistic studies on human argument role processing. In three experiments, we find that language models are able to distinguish verbs that appear in plausible and implausible contexts, where plausibility is determined through the relation between the verb and its preceding arguments. However, none of the models capture the same selective patterns that human comprehenders exhibit during real-time verb prediction. This indicates that language models' capacity to detect verb plausibility does not arise from the same mechanism that underlies human real-time sentence processing.
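One common way to operationalize this kind of plausibility comparison is to compare a verb's surprisal (negative log probability) after a plausible context and after a role-reversed context. The sketch below is only an illustration of that idea, not the authors' procedure: the example contexts, the probability values in `TOY_PROBS`, and the function names are all hypothetical, and a real evaluation would take the probabilities from a language model rather than from a hand-written table.

```python
import math

# Hypothetical P(verb | context) values for illustration only.
# They are NOT taken from the paper or from any real language model.
TOY_PROBS = {
    "the waitress that the customer had": {"tipped": 0.20},
    "the customer that the waitress had": {"tipped": 0.05},
}


def surprisal(prob: float) -> float:
    """Return surprisal in bits: -log2(p). Higher means less expected."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def role_reversal_effect(plausible_ctx: str, reversed_ctx: str, verb: str) -> float:
    """Surprisal of the verb after the role-reversed context minus its
    surprisal after the plausible context. A positive value means the
    verb is less expected when the argument roles are swapped."""
    s_plausible = surprisal(TOY_PROBS[plausible_ctx][verb])
    s_reversed = surprisal(TOY_PROBS[reversed_ctx][verb])
    return s_reversed - s_plausible


effect = role_reversal_effect(
    "the waitress that the customer had",
    "the customer that the waitress had",
    "tipped",
)
print(f"role-reversal effect: {effect:.2f} bits")
```

A positive effect on a measure like this would show only that a model separates plausible from implausible verbs. It would not show that the model reproduces the selective patterns humans exhibit during real-time prediction, which is the distinction the abstract draws.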
Taxonomy
Topics: Language, Metaphor, and Cognition · Neurobiology of Language and Bilingualism · Natural Language Processing Techniques
