On the Role of Context in Reading Time Prediction
Andreas Opedal, Eleanor Chodroff, Ryan Cotterell, Ethan Gotlieb Wilcox

TL;DR
This paper examines how much context contributes to reading time prediction. It shows that standard contextual predictors such as surprisal and pointwise mutual information (PMI) are correlated with unigram frequency, and proposes an orthogonalized predictor under which context appears to explain less of the variance in reading times.
Contribution
The study introduces an orthogonalization technique that projects surprisal onto the orthogonal complement of frequency. This separates contextual effects from frequency effects and challenges earlier assumptions about how much context contributes to reading time prediction.
Findings
The orthogonalized predictor explains less variance in reading times than surprisal does.
Surprisal and PMI are both correlated with frequency, so neither reflects context alone.
Previous studies may have overstated the role of context in predicting reading times.
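The link between PMI, surprisal, and frequency follows from the standard definition of pointwise mutual information. The identity below is a sketch in assumed notation (u for a unit, c for its context, p for the language model's distribution), not the paper's own formulation:

```latex
\mathrm{pmi}(u; c)
  = \log \frac{p(u \mid c)}{p(u)}
  = \underbrace{-\log p(u)}_{\text{unigram surprisal}}
  \;-\; \underbrace{\bigl(-\log p(u \mid c)\bigr)}_{\text{surprisal}}
```

If frequency is measured as the unigram log-probability, PMI is a linear combination of surprisal and frequency. A linear regression that already includes frequency therefore spans the same space whether surprisal or PMI is added as the contextual predictor.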
Abstract
We present a new perspective on how readers integrate context during real-time language comprehension. Our proposals build on surprisal theory, which posits that the processing effort of a linguistic unit (e.g., a word) is an affine function of its in-context information content. We first observe that surprisal is only one out of many potential ways that a contextual predictor can be derived from a language model. Another one is the pointwise mutual information (PMI) between a unit and its context, which turns out to yield the same predictive power as surprisal when controlling for unigram frequency. Moreover, both PMI and surprisal are correlated with frequency. This means that neither PMI nor surprisal contains information about context alone. In response to this, we propose a technique where we project surprisal onto the orthogonal complement of frequency, yielding a new contextual…
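The projection step can be illustrated with an ordinary least-squares residual. The sketch below uses synthetic data; the function name `orthogonalize`, the variable names, and the inclusion of an intercept column are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def orthogonalize(surprisal, frequency):
    """Return the part of `surprisal` that is orthogonal to `frequency`.

    Hypothetical helper: regresses surprisal on [1, frequency] and keeps
    the residual. Including the intercept is an assumption of this sketch.
    """
    s = np.asarray(surprisal, dtype=float)
    f = np.asarray(frequency, dtype=float)
    # Design matrix with an intercept column and the frequency column.
    X = np.column_stack([np.ones_like(f), f])
    # Least-squares coefficients of surprisal on the design matrix.
    coef, *_ = np.linalg.lstsq(X, s, rcond=None)
    # Residual = surprisal minus its projection onto span{1, frequency}.
    return s - X @ coef


# Synthetic example: surprisal is built to correlate with log-frequency.
rng = np.random.default_rng(0)
log_freq = rng.normal(size=1000)
surprisal = 5.0 - 0.8 * log_freq + rng.normal(scale=0.5, size=1000)

ortho = orthogonalize(surprisal, log_freq)
corr_before = np.corrcoef(surprisal, log_freq)[0, 1]
corr_after = np.corrcoef(ortho, log_freq)[0, 1]
print(f"correlation with frequency before: {corr_before:.3f}")
print(f"correlation with frequency after:  {corr_after:.3e}")
```

By construction the residual is uncorrelated with frequency up to floating-point error, so any variance it explains in a downstream reading-time regression cannot be attributed to frequency.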
Taxonomy
Topics: Advanced Text Analysis Techniques · Time Series Analysis and Forecasting
