A Permuted Autoregressive Approach to Word-Level Recognition for Urdu Digital Text
Ahmed Mustafa, Muhammad Tahir Rafique, Muhammad Ijlal Baig, Hasan Sajid, Muhammad Jawad Khan, Karam Dad Kallu

TL;DR
This paper presents a transformer-based, word-level OCR model for digital Urdu text that uses permuted autoregressive sequence (PARSeq) modeling to improve recognition accuracy despite the script's varied fonts, styles, and overlapping characters.
Contribution
Introduces a word-level Urdu OCR model built on the permuted autoregressive sequence (PARSeq) architecture, which handles script variations and overlapping characters better than existing models.
Findings
Achieved a character error rate (CER) of 0.178 on Urdu text images.
Reported higher accuracy than traditional OCR methods.
Handled character reordering and script variations effectively.
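For readers unfamiliar with the metric, CER is commonly computed as the character-level edit distance between the predicted and reference text, divided by the reference length. The sketch below assumes that corpus-level definition; the paper's exact normalization is not stated here, and the function names `edit_distance` and `cer` are illustrative rather than taken from the authors' code. It also counts Unicode code points, so Urdu diacritics are treated as separate characters, which is an assumption.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters (assumed definition)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# One dropped letter in a four-letter reference word gives a CER of 0.25.
example = cer(["کتاب"], ["کتب"])
```

Under this definition, a CER of 0.178 would correspond to roughly 18 character edits per 100 reference characters.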
Abstract
This research paper introduces a novel word-level Optical Character Recognition (OCR) model designed for digital Urdu text. It uses a transformer-based architecture with attention mechanisms to address the distinct challenges of Urdu script recognition, including its diverse text styles, fonts, and variations. The model employs a permuted autoregressive sequence (PARSeq) architecture, which improves performance by training on multiple token permutations, enabling context-aware inference and iterative refinement. This approach allows the model to handle the character reordering and overlapping characters commonly encountered in Urdu script. Trained on a dataset of approximately 160,000 Urdu text images, the model captures the intricacies of Urdu script with a character error rate (CER) of 0.178. Despite ongoing challenges…
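The idea of training on multiple token permutations can be illustrated with attention masks: for a given decoding order, each character position may attend only to positions that come earlier in that order. The sketch below is a simplified illustration under that assumption, not the authors' implementation. It omits the begin-of-sequence and end-of-sequence tokens, and the helper names `permutation_mask` and `sample_permutations` are hypothetical.

```python
import random


def permutation_mask(perm: list[int]) -> list[list[bool]]:
    """Build an attention mask for one decoding order.

    perm[k] is the character position predicted at step k.
    mask[i][j] is True when position i may attend to position j,
    i.e. when j is decoded strictly before i in this order.
    """
    n = len(perm)
    rank = [0] * n
    for step, pos in enumerate(perm):
        rank[pos] = step  # decoding step at which each position is produced
    return [[rank[j] < rank[i] for j in range(n)] for i in range(n)]


def sample_permutations(length: int, k: int, seed: int = 0) -> list[list[int]]:
    """Return k decoding orders: left-to-right first, then random shuffles (assumed scheme)."""
    rng = random.Random(seed)
    orders = [list(range(length))]
    while len(orders) < k:
        order = list(range(length))
        rng.shuffle(order)
        orders.append(order)
    return orders


# The left-to-right order reproduces the usual causal (strictly lower-triangular) mask.
causal = permutation_mask([0, 1, 2])
# Decoding position 2 first, then 0, then 1, gives a different visibility pattern.
shuffled = permutation_mask([2, 0, 1])
```

During training, the same word would be scored under several such masks, so the decoder learns to predict each character from different subsets of its neighbors rather than only from its left context.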
Taxonomy
Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques · Speech Recognition and Synthesis
Methods: Softmax · Attention Is All You Need · Focus
