Lacuna Language Learning: Leveraging RNNs for Ranked Text Completion in Digitized Coptic Manuscripts

Lauren Levine; Cindy Tung Li; Lydia Bremer-McCollum; Nicholas Wagner; Amir Zeldes

arXiv:2407.12247·cs.CL·July 18, 2024

TL;DR

This paper introduces a bidirectional RNN that predicts missing characters in damaged Coptic manuscripts. Its accuracy (72% on single characters, 37% on lacunae of various lengths) is too low for definitive restoration, but the model can help scholars rank the likelihood of candidate reconstructions.

Contribution

The study demonstrates the application of RNNs for ranking text reconstructions in damaged manuscripts, offering a new tool for textual analysis in historical linguistics.

Findings

01 Achieves 72% accuracy on single-character reconstruction

02 Accuracy falls to 37% when reconstructing lacunae of various lengths

03 Neural models can augment traditional methods of textual restoration

Abstract

Ancient manuscripts are frequently damaged, containing gaps in the text known as lacunae. In this paper, we present a bidirectional RNN model for character prediction of Coptic characters in manuscript lacunae. Our best model performs with 72% accuracy on single character reconstruction, but falls to 37% when reconstructing lacunae of various lengths. While not suitable for definitive manuscript reconstruction, we argue that our RNN model can help scholars rank the likelihood of textual reconstructions. As evidence, we use our RNN model to rank reconstructions in two early Coptic manuscripts. Our investigation shows that neural models can augment traditional methods of textual restoration, providing scholars with an additional tool to assess lacunae in Coptic manuscripts.
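The ranking idea in the abstract can be illustrated with a minimal sketch: score each candidate filling of a lacuna by its log-probability in context, then sort. This is not the authors' implementation. A small add-one-smoothed character bigram model stands in for the bidirectional RNN, the toy corpus is English rather than Coptic, and the names `train_bigram`, `log_prob`, and `rank_reconstructions` are hypothetical.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count character unigrams and bigrams in a training string.

    Assumption: this bigram model is only a stand-in for the paper's RNN.
    """
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    vocab = set(corpus)
    return unigrams, bigrams, vocab


def log_prob(text, model):
    """Add-one-smoothed bigram log-probability of `text`."""
    unigrams, bigrams, vocab = model
    v = len(vocab) + 1  # +1 reserves mass for unseen characters
    total = 0.0
    for prev, cur in zip(text, text[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + v))
    return total


def rank_reconstructions(left, right, candidates, model):
    """Sort candidate fillings of a lacuna by in-context score, best first."""
    scored = [(log_prob(left + c + right, model), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]


# Toy example: the gap sits between "th" and " cat".
model = train_bigram("the cat sat on the mat the hat")
ranking = rank_reconstructions("th", " cat", ["x", "a", "e"], model)
print(ranking)
```

Raw log-probabilities are only directly comparable between candidates of the same length, since every extra character adds a negative term. Ranking fillings of different lengths would need some form of normalization, which this sketch does not attempt.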

Code & Models

Repositories

lauren-lizzy-levine/coptic_char_generator
PyTorch · Official

Videos

Lacuna Language Learning: Leveraging RNNs for Ranked Text Completion in Digitized Coptic Manuscripts · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling