Taxonomic Loss for Morphological Glossing of Low-Resource Languages
Michael Ginn, Alexis Palmer

TL;DR
This paper introduces a taxonomic loss function that leverages morphological information to improve low-resource language glossing, especially in human-in-the-loop annotation scenarios, despite not improving single-label accuracy.
Contribution
The paper proposes a novel taxonomic loss function that enhances morphological glossing performance for low-resource languages by exploiting morphological taxonomies.
Findings
Better top-n prediction performance with taxonomic loss
No significant improvement in single-label accuracy
Potential usefulness in human-in-the-loop annotation
Abstract
Morpheme glossing is a critical task in automated language documentation and can greatly benefit other downstream applications. While state-of-the-art glossing systems perform very well for languages with large amounts of existing data, it is more difficult to create useful models for low-resource languages. In this paper, we propose the use of a taxonomic loss function that exploits morphological information to make morphological glossing more performant when data is scarce. We find that while the use of this loss function does not outperform a standard loss function with regard to single-label prediction accuracy, it produces better predictions when considering the top-n predicted labels. We suggest this property makes the taxonomic loss function useful in a human-in-the-loop annotation setting.
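The abstract does not spell out how the taxonomic loss is computed, so the following is only a minimal sketch of one way a label taxonomy could enter a loss function, not the authors' formulation. It assumes a hypothetical two-level taxonomy (`TAXONOMY`) that maps each gloss label to a coarse category, and adds a weighted cross-entropy term on that category to the usual cross-entropy on the gold label. The names `taxonomic_loss`, `top_n`, and the `weight` parameter are illustrative assumptions.

```python
import math

# Hypothetical toy taxonomy (an assumption, not from the paper):
# each fine-grained gloss label maps to one coarse category.
TAXONOMY = {
    "1SG": "person",
    "2SG": "person",
    "PST": "tense",
    "FUT": "tense",
}
LABELS = list(TAXONOMY)  # fixed label order for logit vectors


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def taxonomic_loss(logits, gold, weight=0.5):
    """Cross-entropy on the gold label plus a weighted cross-entropy
    on the gold label's coarse category (illustrative sketch only)."""
    probs = softmax(logits)
    p_label = probs[LABELS.index(gold)]
    # Probability of the coarse category is the sum over its member labels.
    p_category = sum(
        p for label, p in zip(LABELS, probs)
        if TAXONOMY[label] == TAXONOMY[gold]
    )
    return -math.log(p_label) - weight * math.log(p_category)


def top_n(logits, n):
    """Return the n highest-scoring labels, best first."""
    ranked = sorted(zip(logits, LABELS), reverse=True)
    return [label for _, label in ranked[:n]]


# Gold label is "1SG". Both score vectors give it the same probability,
# but the first puts the remaining mass on its sibling "2SG".
near_miss = taxonomic_loss([1.0, 3.0, 0.0, 0.0], "1SG")
far_miss = taxonomic_loss([1.0, 0.0, 3.0, 0.0], "1SG")
print(near_miss < far_miss)
print(top_n([1.0, 3.0, 0.0, 0.0], 2))
```

Under this sketch, a wrong prediction inside the correct category is penalized less than one outside it, which pushes taxonomically related labels toward the top of the ranking. That is one plausible reading of why top-n predictions could improve even when single-label accuracy does not, and why a ranked shortlist may suit an annotator choosing among suggestions.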
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Music and Audio Processing
