Handwritten Text Recognition for Low Resource Languages
Sayantan Dey, Alireza Alaei, Partha Pratim Roy

TL;DR
This paper presents BharatOCR, a novel segmentation-free paragraph-level handwritten text recognition system for low-resource languages like Hindi and Urdu, utilizing a ViT-Transformer architecture with language models to improve accuracy and coherence.
Contribution
It introduces a new ViT-Transformer Decoder-LM architecture specifically designed for paragraph-level handwritten recognition in low-resource languages, with a focus on implicit line segmentation.
Findings
Achieved character recognition rates of 96.24% on Urdu and 94.80% on Hindi datasets.
Outperformed several state-of-the-art Urdu text recognition methods.
Provided benchmark results on multiple datasets, demonstrating high accuracy.
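The character recognition rates above are not defined in this summary. A minimal sketch of one common convention, assuming the rate is 100 × (1 − edit distance / reference length) computed over Unicode code points, is shown below. The function names are hypothetical and the paper's exact metric may differ, for example by counting grapheme clusters rather than code points for Devanagari and Perso-Arabic text.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_recognition_rate(ref: str, hyp: str) -> float:
    """Assumed definition: percentage of reference characters recovered."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return 100.0 * (1 - edit_distance(ref, hyp) / len(ref))
```

Under this assumed definition, a prediction with one wrong character in a four-character reference scores 75%. The value can fall below zero when the prediction contains many spurious insertions.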
Abstract
Despite considerable progress in handwritten text recognition, paragraph-level handwritten text recognition, especially in low-resource languages, such as Hindi, Urdu and similar scripts, remains a challenging problem. These languages, often lacking comprehensive linguistic resources, require special attention to develop robust systems for accurate optical character recognition (OCR). This paper introduces BharatOCR, a novel segmentation-free paragraph-level handwritten Hindi and Urdu text recognition system. We propose a ViT-Transformer Decoder-LM architecture for handwritten text recognition, where a Vision Transformer (ViT) extracts visual features, a Transformer decoder generates text sequences, and a pre-trained language model (LM) refines the output to improve accuracy, fluency, and coherence. Our model utilizes a Data-efficient Image Transformer (DeiT) model proposed for masked image…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Neural Network Applications · Vehicle License Plate Recognition
