Morphology Without Borders: Clause-Level Morphology

Omer Goldman; Reut Tsarfaty

arXiv:2202.12832·cs.CL·October 20, 2022

TL;DR

This paper proposes a shift from word-level to clause-level morphology. It introduces a new dataset and tasks that prove more challenging than their word-level counterparts and align better with contextual language models, with the aim of making morphological analysis consistent across languages.

Contribution

It introduces MightyMorph, a clause-level morphological dataset for four languages, and defines new tasks that improve understanding of morphology in context and across languages.

Findings

01

Clause-level tasks are more challenging than word-level tasks.

02

The dataset covers four typologically diverse languages.

03

Clause-level morphology aligns better with contextual language models.

Abstract

Morphological tasks use large multilingual datasets that organize words into inflection tables, which then serve as training and evaluation data for various tasks. However, a closer inspection of these data reveals profound cross-linguistic inconsistencies that arise from the lack of a clear linguistic and operational definition of what a word is, and that severely impair the universality of the derived tasks. To overcome this deficiency, we propose to view morphology as a clause-level phenomenon rather than a word-level one. It is anchored in a fixed yet inclusive set of features that encapsulates all functions realized in a saturated clause. We deliver MightyMorph, a novel dataset for clause-level morphology covering 4 typologically different languages: English, German, Turkish and Hebrew. We use this dataset to derive 3 clause-level morphological tasks: inflection, reinflection and…
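To make the contrast in the abstract concrete, here is a minimal sketch of an inflection table keyed by a lemma and a feature bundle, first at the word level and then at the clause level. The feature labels, the table entries, and the helper names `inflect` and `reinflect` are illustrative assumptions. They are not taken from MightyMorph and may not match its actual schema.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# A feature bundle is an unordered set of labels (labels here are hypothetical).
Features = FrozenSet[str]
Table = Dict[Tuple[str, Features], str]

# Toy word-level table: each cell holds a single word form.
WORD_LEVEL: Table = {
    ("walk", frozenset({"V", "PRS", "3", "SG"})): "walks",
    ("walk", frozenset({"V", "PST"})): "walked",
}

# Toy clause-level table: each cell holds a whole clause, so functions such as
# subject person/number, tense, and negation are all expressed as features.
CLAUSE_LEVEL: Table = {
    ("walk", frozenset({"IND", "FUT", "NOM(1,SG)"})): "I will walk",
    ("walk", frozenset({"IND", "FUT", "NOM(1,SG)", "NEG"})): "I will not walk",
}


def inflect(table: Table, lemma: str, features: Iterable[str]) -> str:
    """Inflection: look up the form for a lemma and a target feature bundle."""
    return table[(lemma, frozenset(features))]


def reinflect(table: Table, form: str, target: Iterable[str]) -> str:
    """Reinflection: map a known form to the form carrying the target features."""
    for (lemma, _), known_form in table.items():
        if known_form == form:
            return inflect(table, lemma, target)
    raise KeyError(f"form not found in table: {form!r}")
```

In this toy setting both levels share the same lookup interface. What changes is the feature inventory and the fact that a clause-level cell may span several words, so the unit being inflected no longer depends on where a given language draws its word boundaries.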

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification