Morphological Processing of Low-Resource Languages: Where We Are and What's Next
Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya D. McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

TL;DR
This paper surveys recent advances in computational morphology for low-resource languages and presents an empirical study of a truly unsupervised version of the paradigm completion task, in which a language's morphology must be learned from raw text alone. It highlights both the progress made and the challenges that remain.
Contribution
It reviews current methods for low-resource morphological processing and reports an empirical study on truly unsupervised paradigm completion, introducing two new models that bridge existing state-of-the-art systems and identifying future research directions.
Findings
Existing state-of-the-art models, bridged by two newly proposed models, perform reasonably on unsupervised paradigm completion but leave much room for improvement
Unsupervised paradigm completion can significantly expand morphological resources
New models show promise but highlight remaining challenges
Abstract
Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages. Having long been multilingual, the field of computational morphology is increasingly moving towards approaches suitable for languages with minimal or no annotated resources. First, we survey recent developments in computational morphology with a focus on low-resource languages. Second, we argue that the field is ready to tackle the logical next challenge: understanding a language's morphology from raw text alone. We perform an empirical study on a truly unsupervised version of the paradigm completion task and show that, while existing state-of-the-art models bridged by two newly proposed models we devise perform reasonably, there is still much room for improvement. The stakes are high:…
