Fine-tuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition
Jan Kohút, Michal Hradiš

TL;DR
This paper shows that simple fine-tuning with data augmentation is an effective domain adaptation method for CTC-trained handwriting recognition networks, and that it resists overfitting even on very small target datasets.
Contribution
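It shows that straightforward fine-tuning is a surprisingly strong baseline for domain adaptation in handwriting recognition, and it analyzes how augmentation, training data size, and the quality of the pre-trained network affect performance in both writer-dependent and writer-independent settings.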
Findings
Fine-tuning on new writers gives an average relative CER improvement of 25 % with 16 text lines and 50 % with 256 text lines.
Data augmentation enhances fine-tuning effectiveness.
Fine-tuning is resistant to overfitting even with small datasets.
Abstract
In many machine learning tasks, a large general dataset and a small specialized dataset are available. In such situations, various domain adaptation methods can be used to adapt a general model to the target dataset. We show that in the case of neural networks trained for handwriting recognition using CTC, simple fine-tuning with data augmentation works surprisingly well in such scenarios and that it is resistant to overfitting even for very small target domain datasets. We evaluated the behavior of fine-tuning with respect to augmentation, training data size, and quality of the pre-trained network, both in writer-dependent and writer-independent settings. On a large real-world dataset, fine-tuning on new writers provided an average relative CER improvement of 25 % for 16 text lines and 50 % for 256 text lines.
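The relative CER improvement quoted in the abstract can be illustrated with a minimal sketch. It assumes that CER is the total character edit distance divided by the total number of reference characters, and that relative improvement is the reduction in CER divided by the CER before fine-tuning. The paper's exact evaluation protocol may differ. The function names and the example strings below are hypothetical and are not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(refs: list[str], hyps: list[str]) -> float:
    """Character error rate aggregated over all lines (assumed definition)."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    chars = sum(len(r) for r in refs)
    return errors / chars


def relative_improvement(cer_before: float, cer_after: float) -> float:
    """Fraction of the original CER removed by adaptation."""
    return (cer_before - cer_after) / cer_before


# Made-up transcriptions, for illustration only.
refs = ["handwriting", "recognition"]
hyps_general = ["handwritlng", "recoqnitiom"]    # hypothetical output before fine-tuning
hyps_finetuned = ["handwriting", "recoqnition"]  # hypothetical output after fine-tuning

cer_before = cer(refs, hyps_general)
cer_after = cer(refs, hyps_finetuned)
gain = relative_improvement(cer_before, cer_after)
print(f"CER {cer_before:.3f} -> {cer_after:.3f}, relative improvement {gain:.0%}")
```

Under this definition, a 50 % relative improvement means the error rate is halved, for example from a CER of 8 % to 4 %; it does not mean a drop of 50 percentage points.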
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Topic Modeling
