How do we get there? Evaluating transformer neural networks as cognitive models for English past tense inflection
Xiaomeng Ma, Lingyu Gao

TL;DR
This study evaluates transformer neural networks' ability to model English past tense inflection, revealing they learn some regularity patterns but do not fully replicate human-like understanding, especially on irregular verbs.
Contribution
The paper systematically examines transformer models' behavior on past tense inflection, highlighting their partial symbolic learning and limitations compared to human cognition.
Findings
High accuracy on unseen regular verbs
Some accuracy on unseen irregular verbs
Weak correlation with human nonce verb behavior
Abstract
There is an ongoing debate on whether neural networks can grasp the quasi-regularities in languages as humans do. In a typical quasi-regularity task, English past tense inflection, neural network models have long been criticized for learning only to generalize the most frequent pattern rather than the regular pattern; on this view, they cannot learn the abstract categories of regular and irregular and behave unlike humans. In this work, we train a set of transformer models with different settings to examine their behavior on this task. The models achieved high accuracy on unseen regular verbs and some accuracy on unseen irregular verbs. The models' performance on the regulars is heavily affected by type frequency and ratio but not token frequency and ratio, and vice versa for the irregulars. The different behaviors on the regulars and irregulars suggest that the models have some…
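The distinction between type frequency and token frequency in the abstract can be illustrated with a minimal sketch. The toy corpus, the function name `frequency_summary`, and the reading of "ratio" as the regular share of all types or tokens are assumptions made here for illustration; the abstract does not define these quantities, and none of the numbers come from the paper.

```python
from collections import Counter

# Hypothetical toy corpus of (lemma, is_regular) verb tokens.
# These are illustrative values, not data from the paper.
tokens = [
    ("walk", True), ("walk", True), ("jump", True), ("play", True),
    ("go", False), ("go", False), ("go", False), ("go", False),
    ("take", False), ("take", False),
]

def frequency_summary(tokens):
    """Count regular vs. irregular verbs by type (distinct lemmas) and by token (occurrences)."""
    # Token frequency: every occurrence counts.
    token_counts = Counter(is_regular for _, is_regular in tokens)
    # Type frequency: each distinct lemma counts once.
    type_counts = Counter(is_regular for _, is_regular in set(tokens))
    n_tokens = sum(token_counts.values())
    n_types = sum(type_counts.values())
    return {
        "regular_type_count": type_counts[True],
        "irregular_type_count": type_counts[False],
        "regular_token_count": token_counts[True],
        "irregular_token_count": token_counts[False],
        # Assumed definition of "ratio": share of regulars among all types or tokens.
        "regular_type_ratio": type_counts[True] / n_types,
        "regular_token_ratio": token_counts[True] / n_tokens,
    }

summary = frequency_summary(tokens)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this toy corpus the regulars are the majority by type (3 of 5 lemmas) but the minority by token (4 of 10 occurrences), which is the kind of dissociation that lets the two measures be varied separately.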
Taxonomy
Topics: Neural Networks and Applications · Natural Language Processing Techniques · Topic Modeling
