On the Complexity and Typology of Inflectional Morphological Systems
Ryan Cotterell, Christo Kirov, Mans Hulden, Jason Eisner

TL;DR
This paper investigates the complexity of inflectional morphological systems across languages and finds a trade-off between paradigm size and irregularity, where irregularity is measured as an entropy and estimated with a variational approximation.
Contribution
It introduces a novel quantitative methodology for measuring morphological irregularity and verifies an empirical trade-off between paradigm size and irregularity across 31 typologically diverse languages.
Findings
A language's inflectional paradigms tend to be either large in size or highly irregular, but not both.
Irregularity is measured as the entropy of a paradigm's surface realization, that is, how hard it is to jointly predict all of the paradigm's surface forms.
Empirical evidence supports the trade-off hypothesis across multiple language families.
Abstract
We quantify the linguistic complexity of different languages' morphological systems. We verify that there is an empirical trade-off between paradigm size and irregularity: a language's inflectional paradigms may be either large in size or highly irregular, but never both. Our methodology measures paradigm irregularity as the entropy of the surface realization of a paradigm -- how hard it is to jointly predict all the surface forms of a paradigm. We estimate this by a variational approximation. Our measurements are taken on large morphological paradigms from 31 typologically diverse languages.
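To make the entropy-based measurement concrete, here is a minimal sketch of the general idea behind a variational estimate: the entropy of an unknown distribution p is upper-bounded by its cross-entropy with any model q, and that cross-entropy can be estimated from held-out samples. The toy distributions, the outcome labels, and the function names are illustrative assumptions, not the authors' model or data.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a distribution given as {outcome: prob}."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits; never smaller than entropy(p)."""
    return -sum(px * math.log2(q[x]) for x, px in p.items() if px > 0)

def heldout_estimate(samples, q):
    """Estimate H(p, q) as the mean of -log2 q(x) over samples drawn from p."""
    return -sum(math.log2(q[x]) for x in samples) / len(samples)

# Hypothetical toy distribution over three outcomes (stand-ins for
# paradigm realizations) and an imperfect model q of it.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
q = {"a": 0.6, "b": 0.2, "c": 0.2}

true_h = entropy(p)            # what we would like to know
bound = cross_entropy(p, q)    # what a model q lets us compute
samples = ["a", "a", "b", "c"] # held-out draws, here matching p exactly
estimate = heldout_estimate(samples, q)

print(f"H(p) = {true_h:.4f} bits, H(p, q) = {bound:.4f} bits")
```

The gap between the two quantities is the KL divergence from p to q, so a better-fitting model gives a tighter bound; in practice p is unknown and only the held-out estimate is available.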
