Fonts-2-Handwriting: A Seed-Augment-Train framework for universal digit classification
Vinay Uday Prabhu, Sanghyun Han, Dian Ang Yap, Mihail Douhaniaris, Preethi Seshadri, John Whaley

TL;DR
This paper introduces a novel framework that generates synthetic training data from open font datasets to train neural networks for universal handwritten digit classification across multiple Indic scripts, demonstrating effective transfer learning.
Contribution
The paper presents a Seed-Augment-Train/Transfer framework that creates synthetic datasets from font files for training models on handwritten digits in diverse scripts, enabling universal digit recognition.
Findings
CNN trained on synthetic data performs well on real handwritten digits
GAN generates realistic digit images in five Indic scripts
Framework bridges font datasets and transfer learning for digit classification
Abstract
In this paper, we propose a Seed-Augment-Train/Transfer (SAT) framework that contains a synthetic seed image dataset generation procedure for languages with different numeral systems, using freely available open font file datasets. This seed dataset of images is then augmented to create a purely synthetic training dataset, which is in turn used to train a deep neural network that is tested on held-out real-world handwritten digit datasets spanning five Indic scripts: Kannada, Tamil, Gujarati, Malayalam, and Devanagari. We showcase the efficacy of this approach both qualitatively, by training a Boundary-seeking GAN (BGAN) that generates realistic digit images in the five languages, and quantitatively, by testing a CNN trained on the synthetic data on the real-world datasets. This establishes not only an interesting nexus between the font-datasets-world and transfer learning but also…
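The seed-then-augment step described in the abstract can be sketched as follows. This is a minimal, standard-library-only illustration, not the authors' pipeline. The 5x5 bitmaps in SEED_GLYPHS are hypothetical stand-ins for glyphs that would in practice be rasterized from font files, and the two augmentations (a small random translation and random pixel flips) are assumptions chosen for brevity, since the summary does not say which augmentations the paper uses.

```python
import random

# Hypothetical 5x5 seed bitmaps standing in for digit glyphs rasterized from
# open font files (1 = ink, 0 = background). Keys are the class labels.
SEED_GLYPHS = {
    0: ["01110",
        "10001",
        "10001",
        "10001",
        "01110"],
    1: ["00100",
        "01100",
        "00100",
        "00100",
        "01110"],
}


def to_grid(rows):
    """Convert a list of '0'/'1' strings into a 2D list of ints."""
    return [[int(ch) for ch in row] for row in rows]


def shift(grid, dx, dy):
    """Translate the glyph by (dx, dy), padding with background zeros."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = grid[sy][sx]
    return out


def flip_pixels(grid, prob, rng):
    """Flip each pixel independently with probability `prob` (simple noise)."""
    return [[1 - p if rng.random() < prob else p for p in row] for row in grid]


def augment(seeds, per_seed, rng):
    """Build a shuffled, purely synthetic list of (image, label) pairs."""
    dataset = []
    for label, rows in seeds.items():
        grid = to_grid(rows)
        for _ in range(per_seed):
            dx, dy = rng.randint(-1, 1), rng.randint(-1, 1)
            image = flip_pixels(shift(grid, dx, dy), 0.02, rng)
            dataset.append((image, label))
    rng.shuffle(dataset)
    return dataset


# Seeded generator so the synthetic set is reproducible.
synthetic = augment(SEED_GLYPHS, per_seed=10, rng=random.Random(0))
```

The resulting `synthetic` list would serve as training data for a classifier, with real handwritten digits held out for testing only, which mirrors the train-on-synthetic, test-on-real split the abstract describes.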
Taxonomy
Topics: Handwritten Text Recognition Techniques · Video Analysis and Summarization · Digital Media Forensic Detection
Methods: Convolution
