Error Patterns in Historical OCR: A Comparative Analysis of TrOCR and a Vision-Language Model

Ari Vesalainen, Eetu Mäkelä, Laura Ruotsalainen, and Mikko Tolonen

arXiv:2602.14524 · cs.CV · February 17, 2026

Open Access

TL;DR

This paper compares two transformer-based systems, the dedicated OCR model TrOCR and the general-purpose vision-language model Qwen, on historical English texts. It reports differences in error patterns, robustness, and orthographic fidelity, and argues that evaluation for scholarly use should be architecture-aware.

Contribution

The paper provides a systematic analysis of OCR error structure in historical texts, showing how model architecture influences the types of errors produced and what this implies for scholarly digitization.

Findings

01. Qwen achieves lower CER/WER and is more robust to degraded input.

02. TrOCR maintains orthographic fidelity but is prone to cascading errors.

03. Model architecture biases affect error locality and detectability.

Abstract

Optical Character Recognition (OCR) of eighteenth-century printed texts remains challenging due to degraded print quality, archaic glyphs, and non-standardized orthography. Although transformer-based OCR systems and Vision-Language Models (VLMs) achieve strong aggregate accuracy, metrics such as Character Error Rate (CER) and Word Error Rate (WER) provide limited insight into their reliability for scholarly use. We compare a dedicated OCR transformer (TrOCR) and a general-purpose Vision-Language Model (Qwen) on line-level historical English texts using length-weighted accuracy metrics and hypothesis-driven error analysis. While Qwen achieves lower CER/WER and greater robustness to degraded input, it exhibits selective linguistic regularization and orthographic normalization that may silently alter historically meaningful forms. TrOCR preserves orthographic fidelity more consistently…
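
The abstract refers to CER, WER, and length-weighted metrics without defining them in this excerpt. As a minimal sketch, assuming "length-weighted" means that edit counts are summed over all lines and divided by the total reference length (rather than averaging per-line rates), corpus-level CER and WER could be computed as follows. The function names and the sample line pairs are hypothetical illustrations and are not taken from the paper.

```python
from typing import Callable, Iterable, Sequence, Tuple


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference token
                curr[j - 1] + 1,         # insert a hypothesis token
                prev[j - 1] + (r != h),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def length_weighted_error_rate(
    pairs: Iterable[Tuple[str, str]],
    tokenize: Callable[[str], Sequence],
) -> float:
    """Total edits over total reference length, so long lines weigh more.

    Assumption: this is one common reading of "length-weighted"; the
    paper's exact definition is not given in the excerpt above.
    """
    total_edits = 0
    total_ref_len = 0
    for ref, hyp in pairs:
        ref_tokens, hyp_tokens = tokenize(ref), tokenize(hyp)
        total_edits += edit_distance(ref_tokens, hyp_tokens)
        total_ref_len += len(ref_tokens)
    if total_ref_len == 0:
        raise ValueError("reference text is empty")
    return total_edits / total_ref_len


def corpus_cer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Character Error Rate: tokens are individual characters."""
    return length_weighted_error_rate(pairs, list)


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Word Error Rate: tokens are whitespace-separated words."""
    return length_weighted_error_rate(pairs, str.split)


# Invented (reference, hypothesis) line pairs in which the hypothesis
# modernizes an archaic spelling.
sample_pairs = [
    ("musick", "music"),
    ("the publick good", "the public good"),
]
print(f"CER = {corpus_cer(sample_pairs):.3f}")
print(f"WER = {corpus_wer(sample_pairs):.3f}")
```

On these invented pairs the CER is 2/22 (about 0.091) while the WER is 2/4 (0.5): two single-character normalizations barely move the character-level score, which illustrates why aggregate metrics alone can hide orthographic changes of the kind the abstract describes.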

Taxonomy

Topics: Handwritten Text Recognition Techniques · Digital Humanities and Scholarship · Natural Language Processing Techniques