Towards Deployable OCR models for Indic languages

Minesh Mathew; Ajoy Mondal; CV Jawahar

arXiv:2205.06740 · cs.CV · December 19, 2024 · 1 citation

Open Access · 1 model

TL;DR

This paper conducts a comprehensive empirical study of CTC-based neural network models for OCR in 13 Indian languages, introducing a new dataset and outperforming existing OCR tools in most languages.

Contribution

It provides a detailed analysis of neural network models for Indic OCR, compares recognition units, and introduces the Mozhi dataset for benchmarking.

Findings

1. Models outperform public OCR tools in 8 of 13 languages.
2. Synthetic data improves recognition accuracy.
3. Line vs word recognition units impact performance.

Abstract

Recognition of text in word or line images, without the need for sub-word segmentation, has become the mainstream of research and development in text recognition for Indian languages. Modelling unsegmented sequences using Connectionist Temporal Classification (CTC) is the most commonly used approach for segmentation-free OCR. In this work, we present a comprehensive empirical study of various neural network models that use CTC to transcribe the step-wise predictions in the network output into a Unicode sequence. The study is conducted for 13 Indian languages, using an internal dataset of around 1000 pages per language. We study the choice of line vs word as the recognition unit, and the use of synthetic data to train the models. We compare our models with popular publicly available OCR tools for end-to-end document image recognition. Our end-to-end pipeline that employs our…
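To make the CTC transcription step in the abstract concrete, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is an illustration only, not the authors' implementation: the function name `ctc_greedy_decode`, the label table `LABELS`, the choice of index 0 for the blank label, and the toy scores are all assumptions made for this example. The decoder picks the highest-scoring label at each time step, collapses consecutive repeats, and drops blanks, which yields a Unicode string without any sub-word segmentation.

```python
from typing import List, Sequence

# Hypothetical label table for illustration: index 0 is assumed to be the
# CTC blank, the remaining entries are Unicode characters.
LABELS: List[str] = ["<blank>", "क", "म", "ल"]


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    labels: Sequence[str],
    blank: int = 0,
) -> str:
    """Best-path CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    decoded: List[str] = []
    previous = None
    for scores in frame_scores:
        # Index of the highest-scoring label at this time step.
        label = max(range(len(scores)), key=scores.__getitem__)
        # Emit a character only when the label changes and is not the blank.
        if label != previous and label != blank:
            decoded.append(labels[label])
        previous = label
    return "".join(decoded)


if __name__ == "__main__":
    # Toy per-frame scores whose best path is [1, 1, 0, 2, 2, 0, 3].
    toy_scores = [
        [0.1, 0.7, 0.1, 0.1],
        [0.2, 0.6, 0.1, 0.1],
        [0.8, 0.1, 0.05, 0.05],
        [0.1, 0.1, 0.7, 0.1],
        [0.1, 0.1, 0.6, 0.2],
        [0.9, 0.05, 0.03, 0.02],
        [0.1, 0.1, 0.1, 0.7],
    ]
    print(ctc_greedy_decode(toy_scores, LABELS))  # prints "कमल"
```

A blank between two identical labels keeps them as separate characters, which is how CTC represents doubled characters. Beam search or a language model can replace the per-frame argmax; the abstract does not say which decoding strategy the paper uses.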

Code & Models

Models

🤗
MWirelabs/assamese-ocr
model · ♡ 1

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Speech Recognition and Synthesis