Classifier identification in Ancient Egyptian as a low-resource sequence-labelling task

Dmitry Nikolaev; Jorke Grotenhuis; Haleli Harel; Orly Goldwasser

arXiv:2407.00475·cs.CL·July 2, 2024

TL;DR

This paper explores the challenge of identifying classifiers in Ancient Egyptian texts using sequence-labelling neural models, demonstrating promising results despite limited training data and addressing unique tokenisation issues.

Contribution

The paper introduces the first neural sequence-labelling approach to classifier identification in Ancient Egyptian, a low-resource language-processing task.

Findings

01. Neural models outperform frequency-based baselines.

02. Promising performance is achieved with a modest amount of training data.

03. The paper discusses tokenisation and operationalisation issues arising in AE texts.

Abstract

The complex Ancient Egyptian (AE) writing system was characterised by widespread use of graphemic classifiers (determinatives): silent (unpronounced) hieroglyphic signs clarifying the meaning or indicating the pronunciation of the host word. The study of classifiers has intensified in recent years with the launch and quick growth of the iClassifier project, a web-based platform for annotation and analysis of classifiers in ancient and modern languages. Thanks to the data contributed by the project participants, it is now possible to formulate the identification of classifiers in AE texts as an NLP task. In this paper, we make first steps towards solving this task by implementing a series of sequence-labelling neural models, which achieve promising performance despite the modest amount of training data. We discuss tokenisation and operationalisation issues arising from tackling AE texts…
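
To make the task formulation concrete, the sketch below treats each word as a sequence of signs and assigns every sign a binary label (1 for classifier, 0 otherwise), then fits a simple per-sign frequency baseline of the kind a neural model would be compared against. Everything here is an assumption made for illustration: the binary labelling scheme, the function names, and the toy data (Gardiner-style sign codes with invented annotations, not drawn from the iClassifier dataset). It is not the paper's baseline or any of its models.

```python
from collections import Counter, defaultdict

# Toy training data, invented for illustration only.
# Each word is a list of (sign, label) pairs; label 1 marks a classifier.
TRAIN = [
    [("O1", 0), ("D21", 0), ("D54", 1)],
    [("G17", 0), ("D54", 1)],
    [("O1", 0), ("Z1", 0)],
    [("D21", 0), ("N35", 0), ("A1", 1)],
    [("A1", 0), ("Z1", 0)],
    [("N35", 0), ("A1", 1)],
]


def fit_frequency_baseline(words):
    """Map each sign to its majority label in the training data (ties -> 0)."""
    counts = defaultdict(Counter)
    for word in words:
        for sign, label in word:
            counts[sign][label] += 1
    return {sign: int(c[1] > c[0]) for sign, c in counts.items()}


def predict(model, signs):
    """Label each sign independently; unseen signs default to 0."""
    return [model.get(sign, 0) for sign in signs]


model = fit_frequency_baseline(TRAIN)
print(predict(model, ["O1", "D21", "D54"]))
print(predict(model, ["A1", "Z1"]))
```

Because this baseline ignores context, it labels a sign the same way wherever it occurs: in the toy data, A1 is tagged as a classifier even in a word where it was annotated as a non-classifier. A sequence-labelling model can in principle use the neighbouring signs to avoid such errors.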

Taxonomy

Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques · Ancient Egypt and Archaeology

Methods: Autoencoders