Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities

Suhas Arehalli; Brian Dillon; Tal Linzen

arXiv:2210.12187·cs.CL·August 3, 2023·6 cites

TL;DR

This study investigates whether syntactic surprisal estimates from neural language models can predict human processing difficulty in garden path sentences. It finds that the models underestimate the magnitude of human garden path effects even when syntactic predictability is weighted separately from lexical predictability.

Contribution

The paper introduces a method for estimating syntactic predictability independently of lexical predictability in a language model, and shows that surprisal still underestimates human garden path effects when the two are weighted separately.
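
As a rough illustration of what separating the two kinds of predictability can mean, the sketch below assumes a model that assigns a joint probability to the next word and a syntactic category for it. Summing over words gives the probability of the category alone, and the negative log of that probability is a syntactic surprisal. The toy distribution, the category labels, and the function names are hypothetical; they are not taken from the paper or its repository.

```python
import math

# Hypothetical joint distribution P(word, category | context) for the next
# position in a sentence. Values are made up and sum to 1.
joint = {
    ("and", "CONJ"): 0.5,
    (".", "PUNCT"): 0.25,
    ("fell", "MAIN_VERB"): 0.125,
    ("collapsed", "MAIN_VERB"): 0.125,
}

def lexical_surprisal(joint, word):
    """Surprisal (bits) of a word, marginalizing over syntactic categories."""
    p_word = sum(p for (w, _), p in joint.items() if w == word)
    return -math.log2(p_word)

def syntactic_surprisal(joint, category):
    """Surprisal (bits) of a category, marginalizing over words."""
    p_cat = sum(p for (_, c), p in joint.items() if c == category)
    return -math.log2(p_cat)

lex = lexical_surprisal(joint, "fell")
syn = syntactic_surprisal(joint, "MAIN_VERB")
print(f"lexical surprisal of 'fell': {lex:.2f} bits")
print(f"syntactic surprisal of MAIN_VERB: {syn:.2f} bits")
```

In this toy case the category is less surprising than the specific word, because several words share the same category.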

Findings

01

Estimated garden path effects are larger when syntactic predictability is weighted separately from lexical predictability.

02

Neural models still underestimate human garden path effects even when syntactic predictability is weighted independently.

03

Predictability alone does not fully account for human processing difficulty.

Abstract

Humans exhibit garden path effects: When reading sentences that are temporarily structurally ambiguous, they slow down when the structure is disambiguated in favor of the less preferred alternative. Surprisal theory (Hale, 2001; Levy, 2008), a prominent explanation of this finding, proposes that these slowdowns are due to the unpredictability of each of the words that occur in these sentences. Challenging this hypothesis, van Schijndel & Linzen (2021) find that estimates of the cost of word predictability derived from language models severely underestimate the magnitude of human garden path effects. In this work, we consider whether this underestimation is due to the fact that humans weight syntactic factors in their predictions more highly than language models do. We propose a method for estimating syntactic predictability from a language model, allowing us to weigh the cost of lexical…
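
A minimal sketch of the surprisal-based account described above, assuming a linear link between surprisal (in bits) and reading time. The predicted garden path effect is the weighted surprisal difference at the disambiguating word between the ambiguous and unambiguous versions of a sentence. Giving lexical and syntactic surprisal separate weights lets the syntactic term count for more. All probabilities and weights here are made-up illustrative values, not estimates from the paper.

```python
import math

def surprisal(p: float) -> float:
    """Surprisal in bits: the negative log2 probability of an event."""
    return -math.log2(p)

def predicted_effect(lex_diff: float, syn_diff: float,
                     w_lex: float, w_syn: float) -> float:
    """Predicted slowdown (ms) as a weighted sum of surprisal differences."""
    return w_lex * lex_diff + w_syn * syn_diff

# Hypothetical probabilities at the disambiguating word.
lex_ambiguous, lex_unambiguous = 1 / 128, 1 / 16   # word probabilities
syn_ambiguous, syn_unambiguous = 1 / 32, 1 / 4     # structure probabilities

# Surprisal differences between the ambiguous and unambiguous conditions.
lex_diff = surprisal(lex_ambiguous) - surprisal(lex_unambiguous)
syn_diff = surprisal(syn_ambiguous) - surprisal(syn_unambiguous)

# Hypothetical weights in ms per bit: equal weighting vs. heavier syntax.
equal_weights = predicted_effect(lex_diff, syn_diff, w_lex=2.0, w_syn=2.0)
syntax_heavy = predicted_effect(lex_diff, syn_diff, w_lex=2.0, w_syn=6.0)

print(f"equal weights: {equal_weights:.1f} ms")
print(f"syntax weighted more: {syntax_heavy:.1f} ms")
```

Under a model of this kind, a larger syntactic weight yields a larger predicted effect. Whether that is enough to match the human slowdown is the question the paper examines.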

Code & Models

Repositories

sarehalli/syntacticsurprisal
PyTorch · Official

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Reading and Literacy Development · Text Readability and Simplification