Rational Communication Shapes Morphological Composition
Fengyuan Yang, Yongqian Peng, Yuxi Ma, Chenheng Xu, Yixin Zhu

TL;DR
This study models why a language settles on one morpheme combination over plausible alternatives, treating the choice as a trade-off between listener recoverability and speaker production cost within the Rational Speech Act (RSA) framework, and tests the model on historical and contemporary American English corpus data.
Contribution
It introduces a formal model predicting morphological composition choices as rational trade-offs, supported by empirical analysis of historical and contemporary English compounds.
Findings
Attested morphological compositions are systematically preferred over plausible alternatives.
Models combining semantic informativeness and production cost outperform simpler models.
Pragmatic speaker models improve prediction accuracy as candidate sets expand.
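The trade-off behind these findings can be illustrated with the standard RSA speaker rule, in which a speaker chooses a form u for a meaning m with probability proportional to exp(alpha * (log L0(m | u) - cost(u))). The sketch below is a minimal illustration of that generic rule, not the authors' implementation: the candidate forms, compatibility scores, costs (here simply morpheme counts), and the alpha value are all hypothetical toy values chosen for the example.

```python
import math

def literal_listener(lexicon, form):
    """L0(m | u): normalize one form's compatibility scores over meanings."""
    scores = lexicon[form]
    total = sum(scores.values())
    return {meaning: s / total for meaning, s in scores.items()}

def pragmatic_speaker(lexicon, costs, meaning, alpha=1.0):
    """S1(u | m), proportional to exp(alpha * (log L0(m | u) - cost(u)))."""
    utilities = {}
    for form in lexicon:
        p = literal_listener(lexicon, form).get(meaning, 0.0)
        if p > 0:  # forms that cannot convey the meaning are excluded
            utilities[form] = alpha * (math.log(p) - costs[form])
    z = sum(math.exp(v) for v in utilities.values())
    return {form: math.exp(v) / z for form, v in utilities.items()}

# Hypothetical candidate set: each form maps meanings to compatibility scores.
lexicon = {
    "fire+place": {"HEARTH": 0.9, "BONFIRE_SITE": 0.1},
    "fire+spot": {"HEARTH": 0.5, "BONFIRE_SITE": 0.5},
    "warm+stone+place": {"HEARTH": 0.95, "BONFIRE_SITE": 0.05},
}
# Hypothetical production cost: the number of morphemes in each form.
costs = {"fire+place": 2.0, "fire+spot": 2.0, "warm+stone+place": 3.0}

speaker = pragmatic_speaker(lexicon, costs, "HEARTH", alpha=2.0)
best = max(speaker, key=speaker.get)
print(best, speaker)
```

In this toy setting the longest form is the most informative but loses to a shorter form that is almost as recoverable, which is the kind of informativeness and cost trade-off the summary describes.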
Abstract
Human languages expand vocabularies by combining existing morphemes rather than inventing arbitrary forms. Communicative efficiency shapes lexical systems at multiple levels (Gibson et al., 2019), yet morphological composition -- combining morphemes through compounding or affixation -- has rarely been modeled as a historically situated speaker choice among competing morpheme sequences, leaving unanswered why a language settles on one morpheme combination over other plausible alternatives. We ask whether a trade-off between listener recoverability and speaker production cost can predict attested compositions over contemporaneously available alternatives. Here we show, within the Rational Speech Act (RSA) framework (Frank & Goodman, 2012; Goodman & Frank, 2016) using a time-indexed lexicon constructed from the Corpus of Historical American English (COHA) and the Corpus of Contemporary American…
