Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition
Panagiotis Kaliosis, John Pavlopoulos

TL;DR
This paper introduces a novel loss function that uses the Wasserstein distance to align the character frequency distribution of predicted text with a target distribution, improving handwritten text recognition robustness under temporal and regional shifts within a dataset.
Contribution
It proposes a distribution alignment loss for training and a decoding scheme that applies the same alignment to existing models at inference time without retraining, both aimed at distribution shifts in handwritten text recognition.
Findings
Improved accuracy across diverse datasets.
Enhanced robustness to temporal and regional shifts.
Effective at inference time without retraining.
Abstract
Handwritten text recognition aims to convert visual input into machine-readable text, and it remains challenging due to the evolving and context-dependent nature of handwriting. Character sets change over time, and character frequency distributions shift across historical periods or regions, often causing models trained on broad, heterogeneous corpora to underperform on specific subsets. To tackle this, we propose a novel loss function that incorporates the Wasserstein distance between the character frequency distribution of the predicted text and a target distribution empirically derived from training data. By penalizing divergence from expected distributions, our approach enhances both accuracy and robustness under temporal and contextual intra-dataset shifts. Furthermore, we demonstrate that character distribution alignment can also improve existing models at inference time without…
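To make the alignment idea concrete, here is a minimal Python sketch of a Wasserstein distance between two character frequency distributions, and of using it to rescore candidate transcriptions at inference time. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy alphabet, the `weight` parameter, and the fixed ordering of characters on a line with unit spacing are all hypothetical choices. It also uses hard character counts, whereas a training loss would need a differentiable estimate of the predicted distribution.

```python
from collections import Counter

import numpy as np


def char_distribution(text, alphabet):
    """Relative frequency of each alphabet character in `text`.

    Characters outside `alphabet` are ignored (an assumption of this sketch).
    """
    counts = Counter(text)
    freq = np.array([counts.get(ch, 0) for ch in alphabet], dtype=float)
    total = freq.sum()
    if total == 0:
        # No in-alphabet characters: fall back to a uniform distribution.
        return np.full(len(alphabet), 1.0 / len(alphabet))
    return freq / total


def wasserstein_1d(p, q):
    """1-Wasserstein distance between two distributions on the same ordered
    support with unit spacing: the sum of absolute differences of the CDFs.

    Treating characters as points on a line is an assumption; the ordering
    of `alphabet` therefore affects the value.
    """
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def rerank(candidates, target, alphabet, weight=1.0):
    """Pick the candidate text with the best penalised score.

    `candidates` is a list of (text, log_prob) pairs, e.g. beam hypotheses.
    The score is log_prob minus `weight` times the distance between the
    candidate's character distribution and `target` (hypothetical scheme).
    """
    def score(item):
        text, log_prob = item
        dist = wasserstein_1d(char_distribution(text, alphabet), target)
        return log_prob - weight * dist

    return max(candidates, key=score)[0]


if __name__ == "__main__":
    alphabet = "abc"
    # Target distribution derived empirically from (toy) training text.
    target = char_distribution("aabbc", alphabet)
    hypotheses = [("ccc", -1.0), ("aab", -1.2)]
    print(rerank(hypotheses, target, alphabet))
```

In this toy run the hypothesis with the slightly lower model score is selected because its character frequencies sit much closer to the target, which is the behaviour the abstract describes as penalizing divergence from expected distributions.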
Taxonomy
Topics: Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications · Advanced Neural Network Applications
