Human Sentence Processing: Recurrence or Attention?

Danny Merkx; Stefan L. Frank

arXiv:2005.09471·cs.CL·March 31, 2022

TL;DR

This paper compares how well Transformer and RNN language models account for measures of human reading effort. Transformers better explain both self-paced reading times and neural activity during reading, which challenges the widely held view that human sentence processing is recurrent and immediate.

Contribution

It shows that Transformer language models account for measures of human reading effort better than RNN language models, which the authors take as evidence for cue-based retrieval in sentence processing.

Findings

01

Transformers better explain self-paced reading times.

02

Transformers align more closely with neural activity during reading.

03

The results challenge the widely held idea that human sentence processing is recurrent and immediate, and provide evidence for cue-based retrieval.

Abstract

Recurrent neural networks (RNNs) have long been an architecture of interest for computational models of human sentence processing. The recently introduced Transformer architecture outperforms RNNs on many natural language processing tasks, but little is known about its ability to model human language processing. We compare Transformer- and RNN-based language models' ability to account for measures of human reading effort. Our analysis shows Transformers to outperform RNNs in explaining self-paced reading times and neural activity during reading of English sentences. This challenges the widely held idea that human sentence processing involves recurrent and immediate processing, and provides evidence for cue-based retrieval.

Code & Models

Repositories

DannyMerkx/next_word_prediction
PyTorch · Official

Taxonomy

Methods

Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Byte Pair Encoding