The Effect of Surprisal on Reading Times in Information Seeking and Repeated Reading
Keren Gruteke Klein, Yoav Meiri, Omer Shubi, Yevgeni Berzak

TL;DR
This study uses eyetracking data to investigate how surprisal affects reading times in three language processing regimes: information seeking, repeated reading, and the combination of the two. It also reveals limitations of current language models in capturing human cognitive processing.
Contribution
The paper shows that standard surprisal estimates predict processing times across these regimes, whereas regime-specific estimates do not improve predictions, which highlights misalignments between models and humans.
Findings
Standard surprisal predicts reading times across regimes.
Regime-specific surprisal estimates do not enhance prediction accuracy.
Current language models may not accurately reflect human cognitive processing.
Abstract
The effect of surprisal on processing difficulty has been a central topic of investigation in psycholinguistics. Here, we use eyetracking data to examine three language processing regimes that are common in daily life but have not been addressed with respect to this question: information seeking, repeated processing, and the combination of the two. Using standard regime-agnostic surprisal estimates, we find that the prediction of surprisal theory regarding the presence of a linear effect of surprisal on processing times extends to these regimes. However, when using surprisal estimates from regime-specific contexts that match the contexts and tasks given to humans, we find that in information seeking, such estimates do not improve the predictive power of processing times compared to standard surprisals. Further, regime-specific contexts yield near zero surprisal estimates with no…
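To make the quantities in the abstract concrete, the following is a minimal sketch of what a word's surprisal is and what a linear effect of surprisal on reading time means. It assumes the usual definition of surprisal as the negative log probability of a word given its context. The helper names (`surprisal_bits`, `fit_line`) and all numbers are hypothetical and chosen only for illustration; this is not the analysis pipeline used in the study, and a real analysis would take the probabilities from a language model and the reading times from eyetracking data.

```python
import math


def surprisal_bits(prob):
    """Surprisal, in bits, of a word whose probability in context is `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up word probabilities and reading times (ms); not data from the paper.
probs = [0.5, 0.25, 0.125, 0.0625]
reading_times_ms = [210.0, 220.0, 230.0, 240.0]

surprisals = [surprisal_bits(p) for p in probs]
slope, intercept = fit_line(surprisals, reading_times_ms)
print(f"ms per bit of surprisal: {slope:.1f}, baseline: {intercept:.1f} ms")
```

Under a linear effect, each additional bit of surprisal adds a constant amount of reading time, which is what the fitted slope measures in this toy example.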
Taxonomy
Topics: Educational Methods and Media Use · Technology-Enhanced Education Studies · Educational Strategies and Epistemologies
