A Targeted Assessment of Incremental Processing in Neural Language Models and Humans

Ethan Gotlieb Wilcox; Pranali Vani; Roger P. Levy

arXiv:2106.03232·cs.CL·October 26, 2023

TL;DR

This study compares incremental language processing in humans and neural models using reaction time data across various syntactic phenomena, revealing models' limitations in matching human sensitivity to syntactic violations.

Contribution

It introduces a novel online reaction time paradigm, the Interpolated Maze task, and provides a large-scale comparison showing that models under-predict the magnitude of differences in incremental processing difficulty.

Findings

01

Models match humans in the direction of processing difficulty effects but underestimate their magnitude.

02

Models fail to accurately predict the longer human reaction times at syntactic violations.

03

Humans and models show similar increased difficulty in ungrammatical regions.

Abstract

We present a targeted, scaled-up comparison of incremental processing in humans and neural language models by collecting by-word reaction time data for sixteen different syntactic test suites across a range of structural phenomena. Human reaction time data comes from a novel online experimental paradigm called the Interpolated Maze task. We compare human reaction times to by-word probabilities for four contemporary language models, with different architectures and trained on a range of data set sizes. We find that across many phenomena, both humans and language models show increased processing difficulty in ungrammatical sentence regions, with human and model 'accuracy' scores (à la Marvin and Linzen (2018)) about equal. However, although language model outputs match humans in direction, we show that models systematically under-predict the difference in magnitude of incremental processing…
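As a rough illustration of how by-word probabilities can be turned into an 'accuracy' score of the kind mentioned in the abstract, the sketch below converts probabilities to surprisal and counts the items where the ungrammatical critical word is more surprising than its grammatical counterpart. This is a minimal sketch under assumptions, not the authors' pipeline: the function names, the use of a single critical word per item, and the toy probabilities are all hypothetical.

```python
import math


def surprisal(prob):
    """Surprisal in bits of a word with conditional probability `prob`."""
    return -math.log2(prob)


def accuracy(items):
    """Fraction of items where the ungrammatical critical word has
    higher surprisal than the grammatical one.

    `items` is a list of (p_grammatical, p_ungrammatical) pairs, one pair
    per test item (a hypothetical data layout chosen for illustration).
    """
    correct = sum(
        surprisal(p_ungram) > surprisal(p_gram)
        for p_gram, p_ungram in items
    )
    return correct / len(items)


# Toy, made-up probabilities for three items (not data from the paper).
toy_items = [(0.20, 0.01), (0.05, 0.02), (0.10, 0.30)]
toy_accuracy = accuracy(toy_items)
```

The same comparison could in principle be applied to human data by substituting by-word reaction times for surprisal values, which is what makes a shared 'accuracy' score across humans and models possible.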

Code & Models

Repositories

wilcoxeg/targeted-assessment-imaze
Official
