Learning the Ordering of Coordinate Compounds and Elaborate Expressions in Hmong, Lahu, and Chinese
Chenxuan Cui, Katherine J. Zhang, David R. Mortensen

TL;DR
This paper investigates how coordinate compounds and elaborate expressions are ordered in Hmong, Lahu, and Chinese. It shows that computational models can learn these orderings from phonological cues and from lexical cues, which suggests there is more than one route to learning them.
Contribution
The paper demonstrates that decision trees and SVMs can predict the ordering of coordinate compounds (CCs) and elaborate expressions (EEs) from phonology, while neural models can learn EE orderings without phonological input.
Findings
Decision trees learn hierarchies similar to the phonological hierarchies posited by Mortensen (2006).
Neural sequence models effectively learn EE orderings without phonological data.
Multiple routes, including phonology and lexical distribution, can explain EE ordering.
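To make the idea of ordering by a phonological hierarchy concrete, here is a minimal Python sketch. It is not the paper's decision-tree, SVM, or neural setup: it swaps in a much simpler brute-force ranking that picks the ordering of phonological classes agreeing with the most attested pairs. The class labels "A", "B", "C", the toy data, and the function names are all invented for illustration and do not correspond to real tone or vowel categories in Hmong, Lahu, or Chinese.

```python
from collections import Counter
from itertools import permutations

# Invented toy data (assumption): each tuple is the attested order of a
# two-part expression, given as (class of first element, class of second).
# "A", "B", "C" are placeholder phonological classes, not real categories.
ATTESTED = [
    ("A", "B"), ("A", "B"), ("A", "C"),
    ("B", "C"), ("B", "C"), ("A", "C"),
    ("B", "A"),  # one exception to the majority pattern
]

def precedence_counts(pairs):
    """Count how often each class precedes each other class."""
    counts = Counter()
    for first, second in pairs:
        if first != second:
            counts[(first, second)] += 1
    return counts

def best_hierarchy(pairs):
    """Return the ranking of classes that agrees with the most attested orders.

    Brute force over all permutations, so only suitable for a handful of classes.
    """
    counts = precedence_counts(pairs)
    classes = sorted({c for pair in pairs for c in pair})

    def score(ranking):
        pos = {c: i for i, c in enumerate(ranking)}
        return sum(n for (a, b), n in counts.items() if pos[a] < pos[b])

    return max(permutations(classes), key=score)

def predict_order(x, y, ranking):
    """Order two elements so the higher-ranked class comes first."""
    pos = {c: i for i, c in enumerate(ranking)}
    return (x, y) if pos[x] <= pos[y] else (y, x)

hierarchy = best_hierarchy(ATTESTED)
print(hierarchy)                           # learned ranking of the toy classes
print(predict_order("C", "A", hierarchy))  # predicted order for an unseen pair
```

A learned ranking of this kind can be inspected directly, in the same way that the hierarchies learned by decision trees can be compared against hand-posited ones. Nothing in the sketch requires the ranking to be phonetically "natural"; it only has to fit the attested orders.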
Abstract
Coordinate compounds (CCs) and elaborate expressions (EEs) are coordinate constructions common in languages of East and Southeast Asia. Mortensen (2006) claims that (1) the linear ordering of EEs and CCs in Hmong, Lahu, and Chinese can be predicted via phonological hierarchies and (2) these phonological hierarchies lack a clear phonetic rationale. These claims are significant because morphosyntax has often been seen as in a feed-forward relationship with phonology, and phonological generalizations have often been assumed to be phonetically "natural". We investigate whether the ordering of CCs and EEs can be learned empirically and whether computational models (classifiers and sequence labeling models) learn unnatural hierarchies similar to those posited by Mortensen (2006). We find that decision trees and SVMs learn to predict the order of CCs/EEs on the basis of phonology, with DTs…
Taxonomy
Topics: Natural Language Processing Techniques · Linguistic Variation and Morphology · Syntax, Semantics, Linguistic Variation
