Self-Training of Handwritten Word Recognition for Synthetic-to-Real Adaptation
Fabian Wolf, Gernot A. Fink

TL;DR
This paper introduces a self-training method for handwritten word recognition that adapts a model trained on synthetic data to real, unlabeled data, removing the need for manual annotations and improving recognition performance on the target data.
Contribution
It presents a novel self-training approach that leverages synthetic and unlabeled data for handwritten text recognition, eliminating the need for manual labels.
Findings
Effective adaptation from synthetic to real data.
Reduces reliance on manually annotated samples.
Closes performance gap with fully-supervised models.
Abstract
The performance of Handwritten Text Recognition (HTR) models is largely determined by the availability of labeled and representative training samples. However, in many application scenarios labeled samples are scarce or costly to obtain. In this work, we propose a self-training approach to train an HTR model solely on synthetic samples and unlabeled data. The proposed training scheme uses an initial model trained on synthetic data to make predictions for the unlabeled target dataset. Starting from this initial model with rather poor performance, we show that a considerable adaptation is possible by training against the predicted pseudo-labels. Moreover, the investigated self-training strategy does not require any manually annotated training samples. We evaluate the proposed method on four widely used benchmark datasets and show its effectiveness in closing the gap to a model trained in a…
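The loop described in the abstract (train on synthetic data, predict pseudo-labels for the unlabeled target set, retrain against those pseudo-labels) can be illustrated with a toy sketch. This is not the authors' implementation: a one-dimensional nearest-centroid classifier stands in for the HTR model, the data values are made up, and the function names (`fit_centroids`, `predict`, `self_train`, `accuracy`) and the choice to retrain on pseudo-labeled target data only are assumptions made for illustration.

```python
from statistics import mean

def fit_centroids(samples, labels):
    """Train the toy model: one centroid (mean value) per class."""
    classes = sorted(set(labels))
    return {c: mean(x for x, y in zip(samples, labels) if y == c) for c in classes}

def predict(centroids, samples):
    """Assign each sample to the class with the nearest centroid."""
    return [min(centroids, key=lambda c: abs(x - centroids[c])) for x in samples]

def self_train(synthetic_x, synthetic_y, unlabeled_x, rounds=3):
    """Hypothetical self-training loop on a toy model.

    1. Train an initial model on labeled synthetic data.
    2. Predict pseudo-labels for the unlabeled target data.
    3. Retrain on the pseudo-labeled target data (assumption: target data only).
    4. Repeat steps 2 and 3 for a fixed number of rounds.
    """
    model = fit_centroids(synthetic_x, synthetic_y)
    for _ in range(rounds):
        pseudo_labels = predict(model, unlabeled_x)
        model = fit_centroids(unlabeled_x, pseudo_labels)
    return model

def accuracy(model, samples, labels):
    """Fraction of samples whose prediction matches the reference label."""
    predictions = predict(model, samples)
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Made-up synthetic source data: classes centred near 0 and 4.
synthetic_x = [-0.5, 0.0, 0.5, 3.5, 4.0, 4.5]
synthetic_y = [0, 0, 0, 1, 1, 1]

# Made-up target data with a domain shift; the labels are used for
# evaluation only and are never seen during self-training.
real_x = [0.6, 1.1, 1.6, 2.2, 2.7, 4.4, 5.1, 5.6, 6.1, 6.6]
real_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

initial_model = fit_centroids(synthetic_x, synthetic_y)
adapted_model = self_train(synthetic_x, synthetic_y, real_x)

print("accuracy before self-training:", accuracy(initial_model, real_x, real_y))
print("accuracy after self-training: ", accuracy(adapted_model, real_x, real_y))
```

On this toy data the synthetic-only model misclassifies the two shifted samples nearest the class boundary (accuracy 0.8), and retraining on its own pseudo-labels moves the centroids onto the target distribution (accuracy 1.0). A real recognizer would likely need safeguards this sketch omits, such as handling pseudo-label noise.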
Taxonomy
Topics: Handwritten Text Recognition Techniques · Human Pose and Action Recognition · Multimodal Machine Learning Applications
