DiffusionPen: Towards Controlling the Style of Handwritten Text Generation
Konstantina Nikolaidou, George Retsinas, Giorgos Sfikas, and Marcus Liwicki

TL;DR
DiffusionPen introduces a novel 5-shot style handwritten text generation method using Latent Diffusion Models, effectively capturing style and text variability to produce realistic samples and improve handwriting recognition systems.
Contribution
The paper presents a hybrid style extractor and data variation strategies, advancing the capabilities of diffusion models for diverse and unseen handwritten text generation.
Findings
Outperforms existing methods qualitatively and quantitatively
Generated data improves handwriting text recognition performance
Enhances robustness and diversity of handwritten text generation
Abstract
Handwritten Text Generation (HTG) conditioned on text and style is a challenging task due to the variability of inter-user characteristics and the unlimited combinations of characters that form new words unseen during training. Diffusion Models have recently shown promising results in HTG but still remain under-explored. We present DiffusionPen (DiffPen), a 5-shot style handwritten text generation approach based on Latent Diffusion Models. By utilizing a hybrid style extractor that combines metric learning and classification, our approach manages to capture both textual and stylistic characteristics of seen and unseen words and styles, generating realistic handwritten samples. Moreover, we explore several variation strategies of the data with multi-style mixtures and noisy embeddings, enhancing the robustness and diversity of the generated data. Extensive experiments using IAM offline…
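The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the metric-learning term is a triplet loss, that the classification term is writer-ID cross-entropy, that the five reference embeddings are averaged into one style vector, that a multi-style mixture is a linear interpolation, and that noisy embeddings use additive Gaussian noise. All function names and parameters (`margin`, `weight`, `alpha`, `sigma`) are hypothetical.

```python
import math
import random


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Assumed metric-learning term: same-writer pairs closer than different-writer pairs."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


def cross_entropy(logits, label):
    """Assumed classification term: writer-ID cross-entropy from raw logits."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[label]


def hybrid_loss(anchor, positive, negative, logits, label, weight=1.0):
    """Sum of both terms; the weighting scheme is an assumption."""
    return triplet_loss(anchor, positive, negative) + weight * cross_entropy(logits, label)


def few_shot_style(embeddings):
    """Average k reference embeddings (k = 5 in the 5-shot setting) into one style vector."""
    k = len(embeddings)
    return [sum(dim) / k for dim in zip(*embeddings)]


def mix_styles(style_a, style_b, alpha=0.5):
    """Multi-style mixture as a linear interpolation (assumed form)."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(style_a, style_b)]


def noisy_style(style, sigma=0.1, rng=None):
    """Noisy embedding via additive Gaussian noise (assumed form)."""
    rng = rng or random.Random()
    return [s + sigma * rng.gauss(0.0, 1.0) for s in style]


if __name__ == "__main__":
    rng = random.Random(0)
    # Five made-up 4-dimensional reference embeddings per writer.
    writer_a = few_shot_style([[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)])
    writer_b = few_shot_style([[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)])
    blended = mix_styles(writer_a, writer_b, alpha=0.3)
    print(noisy_style(blended, sigma=0.05, rng=rng))
```

In a full pipeline, the resulting style vector would condition the latent diffusion model together with the text. That conditioning step is outside the scope of this sketch.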
Taxonomy
Topics: Handwritten Text Recognition Techniques · Human Motion and Animation · Natural Language Processing Techniques
Methods: Diffusion
