Vectors from Larger Language Models Predict Human Reading Time and fMRI Data More Poorly when Dimensionality Expansion is Controlled

Yi-Chien Lin; Hongao Zhu; William Schuler

arXiv:2505.12196 · cs.CL · May 20, 2025

Open Access

TL;DR

This study shows that larger language model vectors predict human reading times and fMRI data less accurately when controlling for vector dimensionality, indicating a misalignment that worsens with model size.

Contribution

It introduces a method to evaluate LLM scaling by controlling for predictor dimensionality, revealing inverse scaling effects in predicting human data.
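To make "controlling for predictor dimensionality" concrete, here is a minimal sketch of one generic way to do it: project each model's vectors to the same number of dimensions before fitting a regression, so that a larger model does not gain an advantage simply by supplying more predictors. This is not necessarily the authors' procedure. The Gaussian random projection, the ridge regression, the helper names (`random_project`, `ridge_fit`, `heldout_mse`), and the synthetic arrays are all assumptions made for illustration; the numbers it prints carry no empirical meaning.

```python
import numpy as np


def random_project(X, k, seed=0):
    """Project X (n_samples, d) to k dimensions with a Gaussian random matrix.

    Hypothetical helper: one possible way to equalize predictor counts.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge weights (no intercept; inputs assumed centered)."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(k), X.T @ y)


def heldout_mse(X, y, n_train, alpha=1.0):
    """Fit on the first n_train rows and return mean squared error on the rest."""
    mu_x = X[:n_train].mean(axis=0)
    mu_y = y[:n_train].mean()
    w = ridge_fit(X[:n_train] - mu_x, y[:n_train] - mu_y, alpha)
    pred = (X[n_train:] - mu_x) @ w + mu_y
    return float(np.mean((y[n_train:] - pred) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    n = 400
    # Synthetic stand-ins (assumptions): random "vectors" and random "reading times".
    small = rng.standard_normal((n, 64))    # smaller model, 64-dim vectors
    large = rng.standard_normal((n, 512))   # larger model, 512-dim vectors
    y = rng.standard_normal(n)

    k = 32  # shared predictor count for both models
    for name, X in [("small", small), ("large", large)]:
        Xk = random_project(X, k)
        print(name, Xk.shape, heldout_mse(Xk, y, n_train=300))
```

With both models reduced to the same `k`, any remaining difference in held-out error reflects what the vectors encode rather than how many predictors the regression was given.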

Findings

01 Inverse scaling observed when controlling for predictor size.

02 Larger LLM vectors predict human data less accurately.

03 Misalignment between LLMs and human sentence processing increases with model size.

Abstract

The impressive linguistic abilities of large language models (LLMs) have recommended them as models of human sentence processing, with some conjecturing a positive 'quality-power' relationship (Wilcox et al., 2023), in which language models' (LMs') fit to psychometric data continues to improve as their ability to predict words in context increases. This is important because it suggests that elements of LLM architecture, such as veridical attention to context and a unique objective of predicting upcoming words, reflect the architecture of the human sentence processing faculty, and that any inadequacies in predicting human reading time and brain imaging data may be attributed to insufficient model complexity, which recedes as larger models become available. Recent studies (Oh and Schuler, 2023) have shown this scaling inverts after a point, as LMs become excessively large and accurate,…

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Text Readability and Simplification · Ferroelectric and Negative Capacitance Devices

Methods: Softmax · Attention Is All You Need