Integrated Sequence Tagging for Medieval Latin Using Deep Representation Learning
Mike Kestemont, Jeroen De Gussem

TL;DR
This paper introduces an integrated deep learning approach for part-of-speech tagging and lemmatization of medieval Latin, addressing orthographic variation and reducing error propagation compared to traditional lexicon-dependent methods.
Contribution
It proposes a novel layered neural network architecture that jointly performs sequence tagging tasks, improving over cascaded and lexicon-dependent approaches in medieval Latin processing.
Findings
Effective handling of orthographic variation in medieval Latin
Reduction in error propagation compared to traditional methods
Demonstrated improved accuracy in PoS tagging and lemmatization
Abstract
In this paper we consider two sequence tagging tasks for medieval Latin: part-of-speech tagging and lemmatization. These are both basic, yet foundational preprocessing steps in applications such as text re-use detection. Nevertheless, they are generally complicated by the considerable orthographic variation which is typical of medieval Latin. In Digital Classics, these tasks are traditionally solved in a (i) cascaded and (ii) lexicon-dependent fashion. For example, a lexicon is used to generate all the potential lemma-tag pairs for a token, and next, a context-aware PoS-tagger is used to select the most appropriate lemma-tag pair. Apart from the problems with out-of-lexicon items, error percolation is a major downside of such approaches. In this paper we explore the possibility of solving these tasks elegantly using a single, integrated approach. For this, we make use of a layered neural…
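To make the two weaknesses of the traditional pipeline concrete, the following is a minimal sketch of a cascaded, lexicon-dependent tagger. It is not the authors' code: the toy lexicon, the one-rule disambiguator, and the names `LEXICON`, `candidates`, `pick`, and `cascade` are all hypothetical and chosen only for illustration. Step one looks up candidate lemma-tag pairs; step two uses the previous tag as context to choose among them.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (lemma, PoS tag)

# Hypothetical toy lexicon keyed on classical spellings only.
LEXICON: Dict[str, List[Pair]] = {
    "iustus": [("iustus", "ADJ")],
    "amor": [("amo", "VERB"), ("amor", "NOUN")],  # ambiguous form
    "mihi": [("ego", "PRON")],
}


def candidates(token: str) -> List[Pair]:
    """Step 1: generate every lemma-tag pair the lexicon knows for a token."""
    return LEXICON.get(token.lower(), [])


def pick(prev_tag: Optional[str], cands: List[Pair]) -> Optional[Pair]:
    """Step 2: toy context-aware choice (assumed rule: NOUN after ADJ)."""
    if not cands:
        return None  # out-of-lexicon token: nothing to choose from
    if prev_tag == "ADJ":
        for lemma, tag in cands:
            if tag == "NOUN":
                return (lemma, tag)
    return cands[0]


def cascade(tokens: List[str]) -> List[Optional[Pair]]:
    """Run lookup then disambiguation, feeding each chosen tag forward."""
    output: List[Optional[Pair]] = []
    prev_tag: Optional[str] = None
    for token in tokens:
        choice = pick(prev_tag, candidates(token))
        output.append(choice)
        prev_tag = choice[1] if choice else None
    return output


if __name__ == "__main__":
    print(cascade(["iustus", "amor"]))  # classical spelling
    print(cascade(["justus", "amor"]))  # medieval variant of the first token
    print(cascade(["michi"]))           # medieval variant of "mihi"
```

In this toy setup, the variant spelling `justus` is missing from the lexicon, so it receives no analysis, and the lost context then causes the following `amor` to fall back to the verb reading. That is the out-of-lexicon problem and the error percolation described above, shown in miniature.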
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Digital Humanities and Scholarship
