Taxonomic Loss for Morphological Glossing of Low-Resource Languages

Michael Ginn; Alexis Palmer

arXiv:2308.15055 · cs.CL · August 30, 2023 · 1 citation

TL;DR

This paper introduces a taxonomic loss function that leverages morphological information to improve low-resource language glossing, especially in human-in-the-loop annotation scenarios, despite not improving single-label accuracy.

Contribution

The paper proposes a taxonomic loss function that exploits morphological taxonomies to improve top-n glossing predictions for low-resource languages.

Findings

1. Better top-n prediction performance with taxonomic loss
2. No significant improvement in single-label accuracy
3. Potential usefulness in human-in-the-loop annotation
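
The findings contrast single-label accuracy with top-n performance. As a rough illustration of that distinction, the sketch below computes top-n accuracy from per-label scores. The function name `top_n_accuracy` and the toy scores are hypothetical and are not taken from the paper or its repository.

```python
def top_n_accuracy(score_rows, gold_indices, n=1):
    """Fraction of items whose gold label is among the n highest-scoring labels.

    score_rows: one list of per-label scores for each item (hypothetical input format).
    gold_indices: index of the correct label for each item.
    """
    if not gold_indices:
        raise ValueError("need at least one item")
    hits = 0
    for scores, gold in zip(score_rows, gold_indices):
        # Rank label indices from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if gold in ranked[:n]:
            hits += 1
    return hits / len(gold_indices)


# Toy scores for three morphemes over three candidate labels (illustrative only).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
gold = [1, 1, 0]

top1 = top_n_accuracy(scores, gold, n=1)
top2 = top_n_accuracy(scores, gold, n=2)
```

With n=1 this reduces to ordinary single-label accuracy. A model can tie another on n=1 and still score higher for larger n, which is the pattern the findings describe and the one that matters when an annotator picks from a short list of suggestions.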

Abstract

Morpheme glossing is a critical task in automated language documentation and can greatly benefit downstream applications. While state-of-the-art glossing systems perform very well for languages with large amounts of existing data, it is more difficult to create useful models for low-resource languages. In this paper, we propose the use of a taxonomic loss function that exploits morphological information to improve morphological glossing when data is scarce. We find that while this loss function does not outperform a standard loss function with regard to single-label prediction accuracy, it produces better predictions when considering the top-n predicted labels. We suggest this property makes the taxonomic loss function useful in a human-in-the-loop annotation setting.
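
The abstract does not spell out the loss itself, so the following is only a minimal sketch of one common way to build a taxonomy-aware loss: add a second cross-entropy term at a coarser level, where a coarse category's probability is the sum of the probabilities of its fine-grained labels. The label set, the `PARENT` mapping, the `coarse_weight` parameter, and the function names are all assumptions made for illustration; the paper's actual formulation may differ.

```python
import math

# Hypothetical two-level taxonomy (illustrative only, not from the paper):
# each fine-grained gloss label belongs to one coarse category.
FINE_LABELS = ["PST", "FUT", "1SG", "3PL", "STEM"]
PARENT = {
    "PST": "tense",
    "FUT": "tense",
    "1SG": "person",
    "3PL": "person",
    "STEM": "stem",
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def taxonomic_loss(logits, gold, coarse_weight=0.5):
    """Fine-level cross-entropy plus a weighted coarse-level cross-entropy.

    The coarse probability sums the probability mass of every fine label
    that shares the gold label's parent category.
    """
    probs = softmax(logits)
    fine_p = probs[FINE_LABELS.index(gold)]
    coarse_p = sum(
        p for label, p in zip(FINE_LABELS, probs) if PARENT[label] == PARENT[gold]
    )
    return -math.log(fine_p) - coarse_weight * math.log(coarse_p)


# Gold label is PST. The first prediction favors FUT (same category),
# the second favors 1SG (different category).
near_miss = taxonomic_loss([1.0, 3.0, 0.0, 0.0, 0.0], "PST")
far_miss = taxonomic_loss([1.0, 0.0, 3.0, 0.0, 0.0], "PST")
```

Under these assumptions, both predictions assign the same probability to the gold label, but the near miss gets a lower loss because its mass stays within the correct category. Setting `coarse_weight=0.0` recovers plain cross-entropy, under which the two predictions are penalized equally.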

Code & Models

Repositories

michaelpginn/taxo-morph (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Music and Audio Processing