Inflecting when there's no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals

Kate McCurdy; Sharon Goldwater; Adam Lopez

arXiv:2005.08826·cs.CL·December 21, 2020

TL;DR

This paper investigates the limitations of encoder-decoder neural networks in modeling German plural inflections, revealing that they tend to favor the most frequent class and fail to replicate human-like variability and regularity in less common inflections.

Contribution

The study introduces a new dataset of German speakers' productions and ratings of plural forms for novel nouns, and shows that encoder-decoder models mainly generalize the most frequent plural class, which limits their ability to generalize minority classes.

Findings

01 Models favor the most frequent plural class.

02 Models do not replicate human-like variability.

03 Neural models struggle with minority-class generalization.
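
One way to make the first two findings concrete is to compare how much of the output goes to the single most common plural class in model predictions versus speaker productions. The sketch below is only an illustration of that comparison, not the paper's evaluation method: the function names `class_shares` and `majority_share` are hypothetical, and the suffix labels are invented toy values, not data from the study.

```python
from collections import Counter


def class_shares(labels):
    """Return each plural class's share of the given labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def majority_share(labels):
    """Return the share taken by the single most common class."""
    return max(class_shares(labels).values())


# Hypothetical toy labels for eight novel nouns (NOT data from the paper).
speaker_labels = ["-e", "-en", "-s", "-e", "-er", "-en", "-s", "-e"]
model_labels = ["-e", "-e", "-e", "-en", "-e", "-e", "-e", "-e"]

# A large positive gap means the model concentrates on one class
# more strongly than the speakers do.
gap = majority_share(model_labels) - majority_share(speaker_labels)
print(f"speakers: {majority_share(speaker_labels):.3f}")
print(f"model:    {majority_share(model_labels):.3f}")
print(f"gap:      {gap:.3f}")
```

In this toy setup the speakers spread their answers over several classes while the model piles most of its answers onto one, which is the pattern the findings describe in words.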

Abstract

Can artificial neural networks learn to represent inflectional morphology and generalize to new words as human speakers do? Kirov and Cotterell (2018) argue that the answer is yes: modern Encoder-Decoder (ED) architectures learn human-like behavior when inflecting English verbs, such as extending the regular past tense form -(e)d to novel words. However, their work does not address the criticism raised by Marcus et al. (1995): that neural models may learn to extend not the regular, but the most frequent class -- and thus fail on tasks like German number inflection, where infrequent suffixes like -s can still be productively generalized. To investigate this question, we first collect a new dataset from German speakers (production and ratings of plural forms for novel nouns) that is designed to avoid sources of information unavailable to the ED model. The speaker data show high…
