Morphological Reinflection with Multiple Arguments: An Extended Annotation schema and a Georgian Case Study

David Guriel; Omer Goldman; Reut Tsarfaty

arXiv:2203.08527·cs.CL·March 22, 2022

TL;DR

This paper extends the UniMorph morphological annotation schema to better handle complex argument marking, demonstrates its application on Georgian, and shows that this improves dataset coverage and interpretability for morphological reinflection tasks.

Contribution

The paper introduces a hierarchical annotation schema for morphological data and applies it to Georgian verbs, producing a dataset with 4 times more tables and 6 times more verb forms than the existing UniMorph dataset. It also analyzes the impact on reinflection model performance.

Findings

01

Extended schema captures complex argument marking effectively.

02

The Georgian dataset has 4 times more tables and 6 times more verb forms than the existing UniMorph dataset.

03

Reinflection generalization is easier at form level than lemma level.

Abstract

In recent years, a flurry of morphological datasets has emerged, most notably UniMorph, a multi-lingual repository of inflection tables. However, the flat structure of the current morphological annotation schema makes the treatment of some languages quirky, if not impossible, specifically in cases of polypersonal agreement, where verbs agree with multiple arguments using true affixes. In this paper, we propose to address this phenomenon by expanding the UniMorph annotation schema to a hierarchical feature structure that naturally accommodates complex argument marking. We apply this extended schema to one such language, Georgian, and provide a human-verified, accurate and balanced morphological dataset for Georgian verbs. The dataset has 4 times more tables and 6 times more verb forms compared to the existing UniMorph dataset, covering all possible variants of argument marking,…
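
The contrast between a flat feature bundle and a hierarchical one can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the tag notation (semicolon-separated features, with an argument's features grouped in parentheses after a case label such as `NOM` or `ACC`) and the function name `parse_features` are assumptions made for this example.

```python
import re


def parse_features(tag: str) -> dict:
    """Split a feature tag into verb-level features and per-argument features.

    Assumed (hypothetical) notation: features are separated by ';', and an
    argument's features are grouped as LABEL(f1,f2), e.g. 'NOM(1,SG)'.
    """
    parsed = {"flat": [], "args": {}}
    for part in tag.split(";"):
        match = re.fullmatch(r"(\w+)\((.*)\)", part)
        if match:
            # Grouped features belong to one specific argument of the verb.
            label, inner = match.groups()
            parsed["args"][label] = inner.split(",")
        else:
            # Ungrouped features describe the verb form as a whole.
            parsed["flat"].append(part)
    return parsed


# A flat tag has only one slot for person and number.
flat = parse_features("V;PRS;3;SG")

# A hierarchical tag keeps the features of each argument apart.
nested = parse_features("V;PRS;NOM(1,SG);ACC(2,PL)")
```

In the flat tag there is no way to tell which argument a person or number feature belongs to, which is the difficulty the abstract describes for polypersonal agreement. In the nested result each argument carries its own person and number.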

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification