Transfer Learning for Scene Text Recognition in Indian Languages
Sanjana Gunna, Rohit Saluja, C. V. Jawahar

TL;DR
This paper explores transfer learning for scene text recognition in Indian languages. It demonstrates that transferring models among Indian languages yields better results than transferring from English, and it sets new benchmarks with improved recognition rates.
Contribution
The study shows the effectiveness of transfer learning among Indian languages for scene text recognition and introduces new benchmark results for multiple Indian language datasets.
Findings
Transfer learning among Indian languages improves recognition accuracy.
Models transferred from Indian languages are visually closer to target models than those from English.
Word recognition rates improve significantly on multiple datasets.
Abstract
Scene text recognition in low-resource Indian languages is challenging because of complexities like multiple scripts, fonts, text size, and orientations. In this work, we investigate the power of transfer learning for all the layers of deep scene text recognition networks from English to two common Indian languages. We perform experiments on the conventional CRNN model and STAR-Net to ensure generalisability. To study the effect of change in different scripts, we initially run our experiments on synthetic word images rendered using Unicode fonts. We show that the transfer of English models to simple synthetic datasets of Indian languages is not practical. Instead, we propose to apply transfer learning techniques among Indian languages due to similarity in their n-gram distributions and visual features like the vowels and conjunct characters. We then study the transfer learning among six…
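The abstract motivates transfer among Indian languages by the similarity of their n-gram distributions. As a rough illustration of what such a comparison could look like, the sketch below computes character n-gram relative frequencies for two word lists and compares them with cosine similarity. This is an assumption-laden sketch, not the authors' procedure: the function names, the choice of cosine similarity, and the toy romanised word lists are all hypothetical, and comparing languages written in different scripts would first require mapping their characters to a shared symbol set, which is left to the caller here.

```python
from collections import Counter
from math import sqrt


def char_ngram_distribution(words, n=2):
    """Return relative frequencies of character n-grams over a word list.

    Hypothetical helper for illustration; the paper's own measure is not
    given in this summary.
    """
    counts = Counter()
    for word in words:
        # Slide a window of width n across each word.
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse distributions stored as dicts."""
    shared = p.keys() & q.keys()
    dot = sum(p[g] * q[g] for g in shared)
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Toy placeholder vocabularies (assumed, not taken from any dataset).
    lang_a = ["namaste", "nagar", "samay"]
    lang_b = ["namaskaram", "nagaram", "samayam"]
    lang_c = ["street", "school", "light"]

    dist_a = char_ngram_distribution(lang_a)
    dist_b = char_ngram_distribution(lang_b)
    dist_c = char_ngram_distribution(lang_c)

    print("a vs b:", cosine_similarity(dist_a, dist_b))
    print("a vs c:", cosine_similarity(dist_a, dist_c))
```

A higher score between two vocabularies suggests more shared n-gram structure, which is the kind of signal one might use when choosing a source language for transfer. Any real comparison would need large vocabularies rather than three-word lists.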
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM
