Handwritten Text Recognition of Historical Manuscripts Using Transformer-Based Models

Erez Meoded

arXiv:2508.11499·cs.CV·August 18, 2025

TL;DR

This paper applies TrOCR, a transformer-based model, to 16th-century Latin manuscripts and shows that handwriting-specific data augmentation combined with ensemble voting lowers the Character Error Rate (CER) to 1.60.

Contribution

The study introduces four novel augmentation methods tailored for historical handwriting and evaluates ensemble learning, achieving state-of-the-art results in HTR for archival Latin manuscripts.

Findings

01. The best single model, trained with Elastic augmentation, achieves a CER of 1.86

02. A top-5 voting ensemble reduces CER to 1.60, a 42% improvement

03. Domain-specific augmentations significantly enhance recognition accuracy
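
The findings are stated in terms of Character Error Rate. As a rough illustration of how such a figure is typically computed, the sketch below divides the total character-level edit distance by the total number of reference characters. Two assumptions are made here and are not confirmed by this summary: that CER is reported as a percentage, and that it is aggregated over the whole test set rather than averaged per line. The function names are hypothetical.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER in percent (assumed convention, see lead-in)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction in percent, going from baseline to improved."""
    return 100.0 * (baseline - improved) / baseline
```

Applied to the two reported figures, `relative_reduction(1.86, 1.60)` gives roughly 14%, so the 42% improvement is presumably measured against a different baseline, which this summary does not name.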

Abstract

Historical handwritten text recognition (HTR) is essential for unlocking the cultural and scholarly value of archival documents, yet digitization is often hindered by scarce transcriptions, linguistic variation, and highly diverse handwriting styles. In this study, we apply TrOCR, a state-of-the-art transformer-based HTR model, to 16th-century Latin manuscripts authored by Rudolf Gwalther. We investigate targeted image preprocessing and a broad suite of data augmentation techniques, introducing four novel augmentation methods designed specifically for historical handwriting characteristics. We also evaluate ensemble learning approaches to leverage the complementary strengths of augmentation-trained models. On the Gwalther dataset, our best single-model augmentation (Elastic) achieves a Character Error Rate (CER) of 1.86, while a top-5 voting ensemble achieves a CER of 1.60 -…
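
The abstract mentions a top-5 voting ensemble but, in this truncated form, does not say how votes are cast. One plausible reading is sketched below under explicit assumptions: each of the five augmentation-trained models transcribes the same line image, the whole-line transcription predicted most often wins, and ties go to the model ranked highest (for example by validation CER). This is an illustration of line-level majority voting, not necessarily the paper's scheme; `vote_line` and the sample strings are hypothetical.

```python
from collections import Counter


def vote_line(predictions: list[str]) -> str:
    """Pick the most frequent transcription among the models' outputs.

    `predictions` is assumed to be ordered from the best-ranked model to the
    worst, so ties are resolved in favour of the better-ranked model.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best_count = max(counts.values())
    # The first prediction in rank order that reaches the top count wins.
    for candidate in predictions:
        if counts[candidate] == best_count:
            return candidate
    raise AssertionError("unreachable: some candidate always has the top count")


# Hypothetical outputs of five models for a single line image.
line_predictions = ["dominus", "dominvs", "dominus", "dominus", "domimus"]
winner = vote_line(line_predictions)
```

Voting on whole lines is the simplest variant; a character-level scheme would first need to align the competing transcriptions, which this sketch leaves out.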
