Transfer Learning for Scene Text Recognition in Indian Languages

Sanjana Gunna; Rohit Saluja; C. V. Jawahar

arXiv:2201.03180 · cs.CV · January 11, 2022

TL;DR

This paper explores transfer learning for scene text recognition in Indian languages. It demonstrates that transferring models among Indian languages yields better results than transferring from English, and it sets new benchmarks with improved recognition rates.

Contribution

The study shows the effectiveness of transfer learning among Indian languages for scene text recognition and introduces new benchmark results for multiple Indian language datasets.

Findings

01

Transfer learning among Indian languages improves recognition accuracy.

02

Models transferred from Indian languages are visually closer to target models than those from English.

03

The transferred models achieve significant gains in word recognition rates on multiple datasets.
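
For readers unfamiliar with the metric, word recognition rate is commonly computed as the fraction of word images whose predicted transcription exactly matches the ground truth. The sketch below illustrates that common definition only. The function name, the toy Hindi words, and the NFC normalization step are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
import unicodedata


def word_recognition_rate(predictions, ground_truths):
    """Fraction of words predicted exactly right (hypothetical helper)."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        return 0.0
    # Assumption: normalize both strings to NFC so that visually identical
    # words encoded with different code-point sequences compare as equal.
    correct = sum(
        unicodedata.normalize("NFC", pred) == unicodedata.normalize("NFC", truth)
        for pred, truth in zip(predictions, ground_truths)
    )
    return correct / len(ground_truths)


# Toy data (made up): the second prediction drops one vowel sign.
preds = ["नमस्ते", "भरत", "पानी", "घर"]
truth = ["नमस्ते", "भारत", "पानी", "घर"]
print(word_recognition_rate(preds, truth))
```

A single wrong character makes the whole word count as an error, which is why this metric is stricter than a character-level rate.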

Abstract

Scene text recognition in low-resource Indian languages is challenging because of complexities like multiple scripts, fonts, text size, and orientations. In this work, we investigate the power of transfer learning for all the layers of deep scene text recognition networks from English to two common Indian languages. We perform experiments on the conventional CRNN model and STAR-Net to ensure generalisability. To study the effect of change in different scripts, we initially run our experiments on synthetic word images rendered using Unicode fonts. We show that the transfer of English models to simple synthetic datasets of Indian languages is not practical. Instead, we propose to apply transfer learning techniques among Indian languages due to similarity in their n-gram distributions and visual features like the vowels and conjunct characters. We then study the transfer learning among six…
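
The abstract motivates transfer among Indian languages partly by the similarity of their n-gram distributions. The truncated text does not say how that similarity was measured, so the following is only a rough sketch of one possible comparison. Because different scripts use different code points, it compares the shape of each language's rank-ordered n-gram frequency profile rather than the n-grams themselves. All function names, the `top_k` parameter, and the distance measure are assumptions chosen for illustration.

```python
from collections import Counter


def ngram_counts(words, n):
    """Count character n-grams over a word list.

    Note: Python strings are sequences of code points, so for Indic scripts
    these are code-point n-grams, not grapheme-cluster n-grams.
    """
    counts = Counter()
    for word in words:
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def rank_frequency_profile(words, n, top_k):
    """Relative frequencies of the top_k most common n-grams, in rank order."""
    counts = ngram_counts(words, n)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * top_k
    freqs = sorted((c / total for c in counts.values()), reverse=True)[:top_k]
    # Pad with zeros so that profiles from different vocabularies align.
    return freqs + [0.0] * (top_k - len(freqs))


def profile_distance(profile_a, profile_b):
    """Half the L1 distance between two equal-length profiles (assumed measure)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(profile_a, profile_b))


# Toy vocabularies (made up) standing in for two languages.
lang_a = ["abab", "abc"]
lang_b = ["xyxy", "xyz"]
distance = profile_distance(
    rank_frequency_profile(lang_a, n=2, top_k=5),
    rank_frequency_profile(lang_b, n=2, top_k=5),
)
print(distance)
```

A smaller distance means the two languages spread their probability mass over n-grams in a more similar way. A real study would use large vocabularies and several values of n.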

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM