Quo Vadis Handwritten Text Generation for Handwritten Text Recognition?

Vittorio Pippi; Konstantina Nikolaidou; Silvia Cascianelli; George Retsinas; Giorgos Sfikas; Rita Cucchiara; Marcus Liwicki

arXiv:2508.09936·cs.CV·August 14, 2025

TL;DR

This paper systematically evaluates three advanced handwritten text generation models to determine their effectiveness in improving handwritten text recognition, especially for small, diverse datasets, providing practical guidelines for model selection.

Contribution

The paper compares three state-of-the-art styled HTG models, one each from the generative adversarial, diffusion, and autoregressive paradigms, and analyzes their impact on HTR fine-tuning in low-resource scenarios.

Findings

01. Diffusion-based HTG yields the best HTR fine-tuning results.

02. Synthetic data characteristics significantly influence recognition accuracy.

03. Guidelines for selecting effective HTG models are provided.

Abstract

The digitization of historical manuscripts presents significant challenges for Handwritten Text Recognition (HTR) systems, particularly when dealing with small, author-specific collections that diverge from the training data distributions. Handwritten Text Generation (HTG) techniques, which generate synthetic data tailored to specific handwriting styles, offer a promising solution to address these challenges. However, the effectiveness of various HTG models in enhancing HTR performance, especially in low-resource transcription settings, has not been thoroughly evaluated. In this work, we systematically compare three state-of-the-art styled HTG models (representing the generative adversarial, diffusion, and autoregressive paradigms for HTG) to assess their impact on HTR fine-tuning. We analyze how visual and linguistic characteristics of synthetic data influence fine-tuning outcomes and…
