Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling

Yiding Hao; Simon Mendelsohn; Rachel Sterneck; Randi Martinez; Robert Frank

arXiv:2009.03954 · cs.CL · June 25, 2021

TL;DR

This paper re-evaluates the claimed linear relationship between a language model's perplexity and its ability to model human reading times, and proposes a new metric, predictability norm correlation, that relates language model quality to psycholinguistic modeling performance more robustly.

Contribution

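It challenges the earlier claim (Goodkind and Bicknell, 2018) that a language model's ability to model reading times is a linear function of its perplexity, and introduces predictability norm correlation, based on human Cloze probabilities, as a more robust performance measure.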

Findings

01

Perplexity does not reliably predict how well modern models (LSTMs, Transformers, and pre-trained models) fit human reading times.

02

Predictability norm correlation tracks psycholinguistic modeling performance more robustly than perplexity does.

03

The new metric enables comparison of models with different training setups.

Abstract

By positing a relationship between naturalistic reading times and information-theoretic surprisal, surprisal theory (Hale, 2001; Levy, 2008) provides a natural interface between language models and psycholinguistic models. This paper re-evaluates a claim due to Goodkind and Bicknell (2018) that a language model's ability to model reading times is a linear function of its perplexity. By extending Goodkind and Bicknell's analysis to modern neural architectures, we show that the proposed relation does not always hold for Long Short-Term Memory networks, Transformers, and pre-trained models. We introduce an alternate measure of language modeling performance called predictability norm correlation based on Cloze probabilities measured from human subjects. Our new metric yields a more robust relationship between language model quality and psycholinguistic modeling performance that allows for…
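
The quantities named in the abstract can be illustrated with a minimal Python sketch. It assumes surprisal is measured in bits, perplexity is 2 raised to the mean surprisal, and predictability norm correlation is a Pearson correlation between model surprisals and surprisals derived from human Cloze probabilities. The abstract does not give the exact formula, so that last definition, the function names, and the toy probabilities are all illustrative assumptions rather than the authors' implementation.

```python
import math


def surprisal(prob):
    """Surprisal in bits of a word with (non-zero) probability `prob`."""
    return -math.log2(prob)


def perplexity(probs):
    """Perplexity as 2 ** (mean per-word surprisal in bits)."""
    mean_surprisal = sum(surprisal(p) for p in probs) / len(probs)
    return 2 ** mean_surprisal


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def predictability_norm_correlation(model_probs, cloze_probs):
    """ASSUMED definition: Pearson correlation between model surprisal and
    Cloze-derived surprisal. Zero probabilities would need smoothing first."""
    model_s = [surprisal(p) for p in model_probs]
    cloze_s = [surprisal(p) for p in cloze_probs]
    return pearson(model_s, cloze_s)


# Hypothetical toy data: per-word probabilities from a model and from
# human Cloze responses for the same four words.
model_probs = [0.5, 0.25, 0.125, 0.5]
cloze_probs = [0.6, 0.3, 0.1, 0.4]

print("perplexity:", perplexity(model_probs))
print("PNC:", predictability_norm_correlation(model_probs, cloze_probs))
```

Under these assumptions, a model can have low perplexity on a corpus and still correlate weakly with the Cloze-derived surprisals, which is why the two numbers are computed separately here.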

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification