Reverse-Engineering the Reader

Samuel Kiegeland, Ethan Gotlieb Wilcox, Afra Amini, David Robert Reich, Ryan Cotterell

arXiv:2410.13086·cs.CL·October 18, 2024

TL;DR

This paper explores directly aligning language models with human reading time data to enhance their use as cognitive models, revealing a trade-off between psychometric accuracy and NLP performance.

Contribution

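Introduces a novel fine-tuning method that aligns language models with human psychometric data: the model is fine-tuned to implicitly optimize the parameters of a linear regressor that predicts reading times from the model's surprisal estimates, which improves its psychometric predictive power.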

Findings

01

Enhanced psychometric predictive power of language models.

02

Inverse relationship between cognitive alignment and NLP task performance.

03

First demonstration that manipulating a model's alignment to psychometric data affects its downstream performance.

Abstract

Numerous previous studies have sought to determine to what extent language models, pretrained on natural language text, can serve as useful models of human cognition. In this paper, we are interested in the opposite question: whether we can directly optimize a language model to be a useful cognitive model by aligning it to human psychometric data. To achieve this, we introduce a novel alignment technique in which we fine-tune a language model to implicitly optimize the parameters of a linear regressor that directly predicts humans' reading times of in-context linguistic units, e.g., phonemes, morphemes, or words, using surprisal estimates derived from the language model. Using words as a test case, we evaluate our technique across multiple model sizes and datasets and find that it improves language models' psychometric predictive power. However, we find an inverse relationship between…
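The regression step the abstract refers to, a linear regressor that predicts reading times from surprisal, can be illustrated with a small sketch. This is not the authors' fine-tuning procedure: it only shows how surprisal is derived from in-context probabilities and how a linear fit on surprisal compares with an intercept-only baseline. The probabilities and reading times are invented toy values, and the function names (`surprisal`, `fit_linear_regressor`, `mse`) are hypothetical.

```python
import numpy as np


def surprisal(probs):
    """Surprisal in bits: -log2 p(unit | context)."""
    return -np.log2(np.asarray(probs, dtype=float))


def fit_linear_regressor(features, reading_times):
    """Ordinary least squares with an intercept; returns [intercept, slope]."""
    X = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(X, reading_times, rcond=None)
    return coef


def mse(features, reading_times, coef):
    """Mean squared error of the linear predictions."""
    X = np.column_stack([np.ones(len(features)), features])
    return float(np.mean((X @ coef - reading_times) ** 2))


# Toy data (assumed for illustration only): in-context word probabilities
# from some language model, and per-word reading times in milliseconds.
probs = np.array([0.5, 0.25, 0.125, 0.0625])
reading_times = np.array([210.0, 220.0, 230.0, 240.0])

s = surprisal(probs)
coef = fit_linear_regressor(s, reading_times)

# Baseline: predict the mean reading time for every word (no surprisal).
baseline_mse = float(np.mean((reading_times - reading_times.mean()) ** 2))
surprisal_mse = mse(s, reading_times, coef)

# A positive gain means surprisal helps predict reading times.
gain = baseline_mse - surprisal_mse
```

In this toy setup the gain in fit over the baseline stands in for psychometric predictive power. The paper's technique goes further by updating the language model's own parameters so that its surprisal estimates yield a better regressor, which this sketch does not attempt.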

Code & Models

Repositories

samuki/reverse-engineering-the-reader
PyTorch · Official

Taxonomy

Topics: Digital Storytelling and Education