Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences
Patrick Haller, Lena S. Bolliger, Lena A. Jäger

TL;DR
This study investigates how language models' predictability measures relate to individual differences in reading, revealing that models tend to emulate readers with lower verbal intelligence and that incorporating cognitive scores improves prediction accuracy.
Contribution
The paper introduces a method to incorporate individual cognitive capacities into language model predictions of reading times, highlighting biases in models' emulation of different cognitive profiles.
Findings
Incorporating cognitive scores improves prediction accuracy.
Models tend to emulate readers with lower verbal intelligence.
High verbal intelligence correlates with lower sensitivity to predictability effects.
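The quantities behind these findings can be illustrated with a minimal sketch. Surprisal and Shannon entropy follow their standard information-theoretic definitions. The interaction-style slope, the function names, the toy distribution, and the coefficient values are all assumptions made for illustration. They are not the authors' model or results.

```python
import math
from typing import Mapping


def surprisal(prob: float) -> float:
    """Surprisal in bits of a word whose conditional probability is `prob`."""
    return -math.log2(prob)


def entropy(dist: Mapping[str, float]) -> float:
    """Shannon entropy in bits of a next-word probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def surprisal_slope(score: float, b_surprisal: float, b_interaction: float) -> float:
    """Effective surprisal slope for a reader with a standardized cognitive score.

    Hypothetical linear form: the slope is shifted by an interaction term, so a
    negative `b_interaction` means higher-scoring readers show a weaker effect.
    """
    return b_surprisal + b_interaction * score


# Toy next-word distribution (invented, not taken from any language model).
next_word = {"cat": 0.5, "dog": 0.25, "bird": 0.25}
s_dog = surprisal(next_word["dog"])
h = entropy(next_word)

# Hypothetical coefficients (ms per bit); not estimates from the paper.
slope_high = surprisal_slope(score=1.0, b_surprisal=10.0, b_interaction=-2.0)
slope_low = surprisal_slope(score=-1.0, b_surprisal=10.0, b_interaction=-2.0)

print(s_dog, h, slope_high, slope_low)
```

Under these invented coefficients, the reader with the higher score has the smaller surprisal slope, which is the qualitative pattern the third finding describes.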
Abstract
To date, most investigations of surprisal and entropy effects in reading have been conducted on the group level, disregarding individual differences. In this work, we revisit the predictive power of surprisal and entropy measures estimated from a range of language models (LMs) on data of human reading times as a measure of processing effort by incorporating information about language users' cognitive capacities. To do so, we assess the predictive power of surprisal and entropy estimated from generative LMs on reading data obtained from individuals who also completed a wide range of psychometric tests. Specifically, we investigate whether modulating surprisal and entropy relative to cognitive scores increases prediction accuracy of reading times, and we examine whether LMs exhibit systematic biases in the prediction of reading times for cognitively high- or low-performing groups, revealing what…
Taxonomy
Topics: Natural Language Processing Techniques
