Modeling cognitive processes of natural reading with transformer-based Language Models

Bruno Bianchi; Fermín Travi; Juan E. Kamienkowski

arXiv:2505.11485 · cs.CL · May 19, 2025

Open Access

TL;DR

This paper evaluates how well transformer-based language models (GPT2, LLaMA-7B, and LLaMA2-7B) explain eye movement patterns during reading. They outperform earlier N-gram and LSTM models but still do not fully replicate human predictability effects.

Contribution

The study extends previous research by assessing transformer models' ability to predict gaze durations, revealing their improved performance yet persistent limitations in modeling human reading behavior.

Findings

01 Transformer models outperform earlier models in explaining gaze duration variance.

02 Models still do not fully account for human predictability effects.

03 State-of-the-art models differ from human language processing.

Abstract

Recent advances in Natural Language Processing (NLP) have led to the development of highly sophisticated language models for text generation. In parallel, neuroscience has increasingly employed these models to explore cognitive processes involved in language comprehension. Previous research has shown that models such as N-grams and LSTM networks can partially account for predictability effects in explaining eye movement behaviors, specifically Gaze Duration, during reading. In this study, we extend these findings by evaluating transformer-based models (GPT2, LLaMA-7B, and LLaMA2-7B) to further investigate this relationship. Our results indicate that these architectures outperform earlier models in explaining the variance in Gaze Durations recorded from Rioplatense Spanish readers. However, similar to previous studies, these models still fail to account for the entirety of the variance…
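As a rough illustration of how a model-based predictability value per word might be obtained: models such as GPT2 and LLaMA assign probabilities to subword tokens, while Gaze Duration is measured per word, so token probabilities have to be combined into word-level values before they can be related to eye movements. The sketch below does this with the chain rule (summing token log-probabilities within each word). It is not taken from the paper; the function names, the `word_ids` alignment format, and the numeric values are all hypothetical and chosen only for illustration.

```python
import math


def word_log_probs(token_log_probs, word_ids):
    """Sum subword log-probabilities into one log-probability per word.

    token_log_probs: natural-log P(token_i | preceding tokens), one per token.
    word_ids: index of the word each token belongs to (assumed alignment format).
    Returns a list of word log-probabilities ordered by word index.
    """
    if len(token_log_probs) != len(word_ids):
        raise ValueError("token_log_probs and word_ids must have the same length")
    totals = {}
    for log_prob, word_id in zip(token_log_probs, word_ids):
        # Chain rule: log P(word) is the sum of its tokens' conditional log-probs.
        totals[word_id] = totals.get(word_id, 0.0) + log_prob
    return [totals[word_id] for word_id in sorted(totals)]


def surprisal_bits(log_prob):
    """Convert a natural-log probability into surprisal measured in bits."""
    return -log_prob / math.log(2)


# Hypothetical token probabilities for a two-word text:
# word 0 is a single token, word 1 is split into two subword tokens.
token_lps = [math.log(0.5), math.log(0.25), math.log(0.5)]
ids = [0, 1, 1]

word_lps = word_log_probs(token_lps, ids)
surprisals = [surprisal_bits(lp) for lp in word_lps]
print(surprisals)
```

In a setup like this, the per-word values would then serve as a predictor of Gaze Duration in a regression; how the authors actually aligned tokens with words and fitted their models is not described in the text above.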

Taxonomy

Topics: Reading and Literacy Development · Neurobiology of Language and Bilingualism · Text Readability and Simplification

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory