To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times

Thomas Hikaru Clark; Carlos Arriaga; Javier Conde; Gonzalo Martínez; Pedro Reviriego

arXiv:2603.12105 · cs.CL · March 13, 2026

TL;DR

This study explores how large language models can estimate sentence-level psycholinguistic norms like memorability and reading times, showing that fine-tuning improves their correlation with human data, but zero-shot performance remains inconsistent.

Contribution

The paper extends LLM-based psycholinguistic norm estimation to sentence-level features, demonstrating the benefits of fine-tuning for better alignment with human judgments.

Findings

01 Fine-tuning improves LLM estimates of sentence memorability and reading times.

02 Zero-shot and few-shot performance is inconsistent, so such estimates should be applied with care.

03 Fine-tuned models outperform baseline predictors of human-derived norms.

Abstract

Large Language Models (LLMs) have recently been shown to produce estimates of psycholinguistic norms, such as valence, arousal, or concreteness, for words and multiword expressions, that correlate with human judgments. These estimates are obtained by prompting an LLM, in zero-shot fashion, with a question similar to those used in human studies. Meanwhile, for other norms such as lexical decision time or age of acquisition, LLMs require supervised fine-tuning to obtain results that align with ground-truth values. In this paper, we extend this approach to the previously unstudied features of sentence memorability and reading times, which involve the relationship between multiple words in a sentence-level context. Our results show that via fine-tuning, models can provide estimates that correlate with human-derived norms and exceed the predictive power of interpretable baseline predictors,…
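As a rough illustration of what "correlate with human-derived norms" can mean in practice, the sketch below compares model estimates against human norms using Pearson and Spearman correlation. This excerpt does not say which coefficient the authors report, so both are shown as assumptions. The function names and the five value pairs are hypothetical placeholders, not data or code from the paper.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical placeholder values for five sentences (NOT from the paper).
human_norms = [0.62, 0.80, 0.45, 0.91, 0.30]
model_estimates = [0.58, 0.75, 0.50, 0.88, 0.41]

print("Pearson: ", pearson(human_norms, model_estimates))
print("Spearman:", spearman(human_norms, model_estimates))
```

Spearman correlation depends only on the ordering of the sentences, which can matter when a model's raw outputs sit on a different scale from the human ratings. Pearson correlation is also sensitive to how linear the relationship is.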

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Mental Health via Writing · Text Readability and Simplification