Morphological Reinflection with Multiple Arguments: An Extended Annotation schema and a Georgian Case Study
David Guriel, Omer Goldman, Reut Tsarfaty

TL;DR
This paper extends the UniMorph morphological annotation schema to better handle complex argument marking, demonstrates its application on Georgian, and shows that this improves dataset coverage and interpretability for morphological reinflection tasks.
Contribution
The paper introduces a hierarchical annotation schema for morphological data, applied to Georgian, significantly increasing dataset size and coverage, and analyzing the impact on reinflection model performance.
Findings
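The extended schema captures complex argument marking, including polypersonal agreement.
The new Georgian dataset has 4 times more tables and 6 times more verb forms than the existing UniMorph dataset.
Reinflection models generalize more easily when the data is split at the form level than at the lemma level.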
Abstract
In recent years, a flurry of morphological datasets has emerged, most notably UniMorph, a multi-lingual repository of inflection tables. However, the flat structure of the current morphological annotation schema makes the treatment of some languages quirky, if not impossible, specifically in cases of polypersonal agreement, where verbs agree with multiple arguments using true affixes. In this paper, we propose to address this phenomenon by expanding the UniMorph annotation schema to a hierarchical feature structure that naturally accommodates complex argument marking. We apply this extended schema to one such language, Georgian, and provide a human-verified, accurate and balanced morphological dataset for Georgian verbs. The dataset has 4 times more tables and 6 times more verb forms compared to the existing UniMorph dataset, covering all possible variants of argument marking,…
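To make the contrast between a flat and a hierarchical feature structure concrete, here is a minimal sketch. The tag notation `CASE(person,number)` and the function name `parse_tag` are illustrative assumptions, not necessarily the paper's exact format. In a flat bundle such as `V;PRS;1;SG;2;PL` it is unclear which person and number belong to which argument; nesting them under the argument they describe removes that ambiguity.

```python
import re

def parse_tag(tag: str) -> dict:
    """Parse a hierarchical tag into flat features and per-argument bundles.

    Assumed (hypothetical) notation: top-level features are separated by ';',
    and an argument's features are nested as NAME(f1,f2,...).
    """
    features = []   # verb-level features, e.g. part of speech and tense
    arguments = {}  # features grouped by the argument they describe
    for part in tag.split(";"):
        match = re.fullmatch(r"(\w+)\(([^()]*)\)", part)
        if match:
            arguments[match.group(1)] = match.group(2).split(",")
        else:
            features.append(part)
    return {"features": features, "arguments": arguments}

# A verb form agreeing with a 1st-person singular subject
# and a 2nd-person plural object.
example = parse_tag("V;PRS;NOM(1,SG);ACC(2,PL)")
```

With this structure, each argument's person and number can be read off directly, for example `example["arguments"]["ACC"]`.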
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
