Benchmarking Scene Text Recognition in Devanagari, Telugu and Malayalam
Minesh Mathew, Mohit Jain, CV Jawahar

TL;DR
This paper benchmarks scene text recognition for Devanagari, Telugu, and Malayalam scripts using a deep learning model trained on synthetic data and tested on real images, establishing a new dataset and evaluation framework.
Contribution
It introduces a new benchmark dataset for Indic scripts and demonstrates an end-to-end CNN-RNN model trained with synthetic data for scene text recognition.
Findings
The CNN-RNN model achieves promising recognition accuracy on real scene images.
Synthetic data effectively trains the recognition system for Indic scripts.
The benchmark provides a foundation for future research in Indic script scene text recognition.
Abstract
Inspired by the success of deep learning based approaches to English scene text recognition, we pose and benchmark scene text recognition for three Indic scripts: Devanagari, Telugu and Malayalam. Synthetic word images rendered from Unicode fonts are used for training the recognition system, and the performance is benchmarked on a new IIIT-ILST dataset comprising hundreds of real scene images containing text in the above-mentioned scripts. We use a segmentation-free, hybrid but end-to-end trainable CNN-RNN deep neural network for transcribing the word images to the corresponding texts. The cropped word images need not be segmented into sub-word units, and the error is calculated and backpropagated for the given word image at once. The network is trained using CTC loss, which has proven quite effective for sequence-to-sequence transcription tasks. The CNN layers in the network…
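To make the segmentation-free training objective concrete, here is a minimal pure-Python sketch of CTC: the probability of a label sequence given per-frame class probabilities (the quantity whose negative log is the CTC loss), plus greedy best-path decoding. This is an illustration under stated assumptions, not the authors' implementation: the function names, the use of class index 0 as the blank symbol, and the choice of greedy decoding are all hypothetical, and a real system would work in log space on the outputs of the CNN-RNN.

```python
import math

def ctc_label_probability(probs, label, blank=0):
    """Sum the probabilities of all frame alignments that collapse to `label`.

    probs: list of T frames, each a list of per-class probabilities.
    label: list of class indices (no blanks). blank=0 is an assumption.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for c in label:
        ext += [c, blank]
    size = len(ext)

    # alpha[s] = probability of all partial alignments ending at ext[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skip over a blank only between two different non-blank symbols
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid alignments end on the last symbol or the trailing blank
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)

def ctc_loss(probs, label, blank=0):
    """Negative log-likelihood of `label` under the frame probabilities."""
    return -math.log(ctc_label_probability(probs, label, blank))

def ctc_greedy_decode(probs, blank=0):
    """Best-path decoding: per-frame argmax, collapse repeats, drop blanks."""
    out, prev = [], None
    for frame in probs:
        k = max(range(len(frame)), key=frame.__getitem__)
        if k != prev and k != blank:
            out.append(k)
        prev = k
    return out
```

For instance, with two frames that are uniform over the blank and a single character class, the one-character label has probability 0.75, since three of the four possible alignments (character-blank, blank-character, character-character) collapse to it. No character boundaries are ever supplied, which is why the cropped word image does not need to be segmented.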
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Image Processing and 3D Reconstruction
