A tailored Handwritten-Text-Recognition System for Medieval Latin
Philipp Koch, Gilary Vera Nuñez, Esteban Garces Arias, Christian Heumann, Matthias Schöffel, Alexander Häberlin, and Matthias Aßenmacher

TL;DR
This paper presents a specialized handwritten text recognition system for medieval Latin that combines image segmentation, transformer-based models, and data augmentation, reaching an accuracy that surpasses Google Cloud Vision.
Contribution
The work introduces a tailored end-to-end pipeline for medieval Latin HTR, integrating segmentation, transformer models, and extensive data augmentation for improved accuracy.
Findings
Achieved a character error rate (CER) of 0.015, outperforming Google Cloud Vision.
Developed a pipeline for locating, extracting, and transcribing medieval Latin lemmas.
Demonstrated stable performance with a highly competitive model.
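The CER reported above is conventionally the character-level edit distance between the predicted and reference transcriptions, divided by the number of reference characters. The following is a minimal sketch of that standard definition, not the authors' evaluation code: the function names `levenshtein` and `cer` are hypothetical, the example lemmas are made up for illustration, and pooling edits over the whole set (rather than averaging per sample) is an assumption.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Character error rate pooled over a data set (assumed aggregation)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Illustrative strings only, not data from the paper.
    refs = ["abbas", "abbatia"]
    hyps = ["abbas", "abbatla"]  # one substituted character in the second lemma
    print(f"CER = {cer(refs, hyps):.4f}")
```

Under this definition, a CER of 0.015 corresponds to roughly 1.5 character edits per 100 reference characters.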
Abstract
The Bavarian Academy of Sciences and Humanities aims to digitize its Medieval Latin Dictionary. This dictionary entails record cards referring to lemmas in medieval Latin, a low-resource language. A crucial step of the digitization process is the Handwritten Text Recognition (HTR) of the handwritten lemmas found on these record cards. In our work, we introduce an end-to-end pipeline, tailored to the medieval Latin dictionary, for locating, extracting, and transcribing the lemmas. We employ two state-of-the-art (SOTA) image segmentation models to prepare the initial data set for the HTR task. Furthermore, we experiment with different transformer-based models and conduct a set of experiments to explore the capabilities of different combinations of vision encoders with a GPT-2 decoder. Additionally, we also apply extensive data augmentation resulting in a highly competitive model. The…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Image Processing and 3D Reconstruction
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Dense Connections · Weight Decay · Dropout · Discriminative Fine-Tuning · Cosine Annealing
