Towards Boosting the Accuracy of Non-Latin Scene Text Recognition

Sanjana Gunna; Rohit Saluja; C. V. Jawahar

arXiv:2201.03185·cs.CV·January 11, 2022

TL;DR

This paper investigates the reasons behind lower scene-text recognition accuracy in non-Latin languages and proposes data augmentation strategies, including region-based font search, to significantly improve recognition performance.

Contribution

It identifies key factors affecting non-Latin text recognition accuracy and introduces region-based font augmentation to enhance deep learning models for these languages.

Findings

01

Improved word recognition rates (WRRs) on Arabic datasets by 24.54% and 2.32%.

02

Enhanced recognition rates for Devanagari datasets by 7.88% and 3.72%.

03

Highlighted the importance of font diversity and dataset size for recognition accuracy.
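
The findings above are reported as word recognition rates (WRRs). As a minimal sketch, assuming the common definition of WRR as the percentage of word images whose predicted string exactly matches the ground truth, the metric could be computed as follows. The function name `word_recognition_rate` and the sample strings are illustrative and are not taken from the paper or its repository.

```python
def word_recognition_rate(predictions, ground_truths):
    """Return the percentage of words recognized exactly (assumed WRR definition)."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        raise ValueError("at least one word is required")
    # A word counts as correct only if the whole string matches.
    correct = sum(p == g for p, g in zip(predictions, ground_truths))
    return 100.0 * correct / len(ground_truths)


# Illustrative data: three of the four predictions match exactly.
predicted = ["सड़क", "बाजार", "مدرسة", "exit"]
expected = ["सड़क", "बाजार", "مدرسة", "exits"]
wrr = word_recognition_rate(predicted, expected)
```

Because the comparison is on whole strings, a single wrong character makes the entire word count as an error, which is why WRR is a stricter measure than a character-level rate.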

Abstract

Scene-text recognition is remarkably better in Latin languages than in non-Latin languages due to several factors like multiple fonts, simplistic vocabulary statistics, updated data generation tools, and writing systems. This paper examines the possible reasons for low accuracy by comparing English datasets with those of non-Latin languages. We compare various features like the size (width and height) of the word images and word length statistics. Over the last decade, generating synthetic datasets with powerful deep learning techniques has tremendously improved scene-text recognition. Several controlled experiments are performed on English by varying the number of (i) fonts to create the synthetic data and (ii) created word images. We discover that these factors are critical for scene-text recognition systems. The English synthetic datasets utilize over 1400 fonts while Arabic and other…
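
The controlled experiments described in the abstract vary two factors: the number of fonts and the number of synthetic word images. A rough sketch of how such an experiment grid might be planned is shown below. Everything in it is an assumption made for illustration: the font file names, the vocabulary, the grid values, and the helpers `sample_font_subset` and `plan_word_images` are hypothetical and do not come from the paper or its repository. The sketch only plans (word, font) pairs and does not render images.

```python
import random


def sample_font_subset(font_pool, num_fonts, seed=0):
    """Pick a reproducible random subset of fonts (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(font_pool, num_fonts)


def plan_word_images(vocabulary, fonts, num_images, seed=0):
    """Pair a random word with a random font for each image to be rendered."""
    rng = random.Random(seed)
    return [(rng.choice(vocabulary), rng.choice(fonts)) for _ in range(num_images)]


# Placeholder font pool; the file names are invented for this example.
font_pool = [f"font_{i:04d}.ttf" for i in range(1400)]
# Placeholder vocabulary.
vocabulary = ["street", "market", "exit", "hotel"]

# Assumed grid values, chosen only to show the two factors being varied.
grid = {}
for num_fonts in (100, 1000, 1400):
    for num_images in (1000, 5000):
        fonts = sample_font_subset(font_pool, num_fonts)
        grid[(num_fonts, num_images)] = plan_word_images(vocabulary, fonts, num_images)
```

Each grid cell would then feed a synthetic data renderer and a separate training run, so that recognition accuracy can be compared as one factor changes while the other is held fixed.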

Code & Models

Repositories

firesans/nonlatinphotoocr (PyTorch, official)
