Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers
Akhilesh Kakolu Ramarao, Kevin Tang, Dinah Baer-Henney

TL;DR
This paper investigates how transformer models learn irregular Spanish verb patterns, emphasizing the importance of input frequency and revealing how models handle irregularities and regularization in morphological inflection tasks.
Contribution
It introduces a systematic analysis of frequency effects on irregular verb pattern learning in transformers, highlighting the impact of input distribution on morphological modeling.
Findings
Models perform better on irregular L-shaped verbs in uneven frequency conditions.
Primacy effects are observed, but no consistent recency effects.
Memorization increases with the proportion of irregular verbs.
Abstract
Over the past decade, various studies have addressed how speakers solve the so-called Paradigm Cell Filling Problem (PCFP) (Ackerman et al., 2009) across different languages. The PCFP addresses a fundamental question in morphological processing: how do speakers accurately generate inflected forms of words when presented with incomplete paradigms? This problem is particularly salient when modeling complex inflectional systems. We focus on Spanish verbal paradigms, in which certain verbs follow an irregular L-shaped pattern: the first-person singular present indicative stem matches the stem used throughout the present subjunctive mood. We formulate the problem as a morphological reinflection task. Specifically, we investigate the role of input frequency in the acquisition of regular versus irregular L-shaped patterns in transformer models. By systematically manipulating the…
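To make the L-shaped pattern and the reinflection framing concrete, here is a minimal sketch. It is not the paper's code or data format: the cell labels, the hand-segmented toy paradigms for hacer and hablar, and the helper names `is_l_shaped` and `reinflection_pair` are all assumptions made for illustration.

```python
# Illustrative sketch only: cell labels and helpers are hypothetical,
# and the paradigms are hand-segmented into (stem, ending) pairs.
IND = ["ind.1sg", "ind.2sg", "ind.3sg", "ind.1pl", "ind.2pl", "ind.3pl"]
SBJ = ["sbj.1sg", "sbj.2sg", "sbj.3sg", "sbj.1pl", "sbj.2pl", "sbj.3pl"]

# The "L": first-person singular present indicative plus all present subjunctive cells.
L_CELLS = ["ind.1sg"] + SBJ


def paradigm(ind, sbj):
    """Map each present-tense cell label to its (stem, ending) pair."""
    return dict(zip(IND + SBJ, ind + sbj))


# hacer: stem "hag" in the L cells, "hac" in the remaining indicative cells.
HACER = paradigm(
    [("hag", "o"), ("hac", "es"), ("hac", "e"),
     ("hac", "emos"), ("hac", "éis"), ("hac", "en")],
    [("hag", "a"), ("hag", "as"), ("hag", "a"),
     ("hag", "amos"), ("hag", "áis"), ("hag", "an")],
)

# hablar: regular, one stem "habl" throughout.
HABLAR = paradigm(
    [("habl", "o"), ("habl", "as"), ("habl", "a"),
     ("habl", "amos"), ("habl", "áis"), ("habl", "an")],
    [("habl", "e"), ("habl", "es"), ("habl", "e"),
     ("habl", "emos"), ("habl", "éis"), ("habl", "en")],
)


def is_l_shaped(p):
    """True if the L cells share one stem that no other indicative cell uses."""
    l_stems = {p[cell][0] for cell in L_CELLS}
    other_stems = {p[cell][0] for cell in IND if cell != "ind.1sg"}
    return len(l_stems) == 1 and l_stems.isdisjoint(other_stems)


def reinflection_pair(p, source, target):
    """Build one (source form, target cell, target form) reinflection example."""
    return ("".join(p[source]), target, "".join(p[target]))


if __name__ == "__main__":
    print(is_l_shaped(HACER), is_l_shaped(HABLAR))
    print(reinflection_pair(HACER, "ind.1sg", "sbj.3sg"))
```

In this toy setup, a reinflection model would receive the source form and the target cell and be asked to produce the target form. How the paper actually encodes its inputs and tags is not stated in the text above.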
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Phonetics and Phonology Research
