Urdu text in natural scene images: a new dataset and preliminary text detection

Hazrat Ali; Khalid Iqbal; Ghulam Mujtaba; Ahmad Fayyaz; Mohammad Farhad Bulbul; Fazal Wahab Karam; Ali Zahir

arXiv:2109.08060·cs.CV·September 17, 2021

TL;DR

This paper introduces a new dataset of 500 natural scene images containing Urdu text, along with a novel multi-stage method for Urdu text detection, demonstrating promising initial results and establishing a baseline for future research.

Contribution

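The work presents a new dataset of Urdu text in natural scene images and a detection method that extracts candidate regions with channel-enhanced MSER, then filters them in two stages: first by geometric properties, then with a support vector machine classifier.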

Findings

01. Good detection performance on test images
02. Dataset will be publicly available for research
03. Baseline established for Urdu text detection

Abstract

Text detection in natural scene images for content analysis is an interesting task. The research community has seen some great developments in English and Mandarin text detection. However, Urdu text extraction in natural scene images has not been well addressed. In this work, first, a new dataset is introduced for Urdu text in natural scene images. The dataset comprises 500 standalone images acquired from real scenes. Second, the channel-enhanced Maximally Stable Extremal Region (MSER) method is applied to extract Urdu text regions as candidates in an image. A two-stage filtering mechanism is applied to eliminate non-candidate regions. In the first stage, text and noise are classified based on their geometric properties. In the second stage, a support vector machine classifier is trained to discard non-text candidate regions. After this, text candidate regions are linked using…
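
The first filtering stage described in the abstract, which separates text from noise by geometric properties, might look roughly like the sketch below. This is an illustration only, not the authors' implementation: the `Region` structure, the choice of properties (area, aspect ratio, fill ratio), and every threshold value are assumptions, since the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Bounding box and pixel count of one candidate region (hypothetical structure)."""
    x: int
    y: int
    width: int
    height: int
    pixel_count: int  # number of pixels that belong to the region


def is_text_like(region, min_area=30, max_aspect=8.0, min_fill=0.1):
    """Return True if the region passes simple geometric checks.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    box_area = region.width * region.height
    if box_area == 0 or region.pixel_count < min_area:
        return False  # degenerate or too small: treated as noise
    aspect = max(region.width, region.height) / min(region.width, region.height)
    if aspect > max_aspect:
        return False  # extremely elongated: more likely a line or an edge
    fill = region.pixel_count / box_area
    return fill >= min_fill  # very sparse regions are unlikely to be characters


def filter_candidates(regions):
    """Keep only the candidate regions that look geometrically like text."""
    return [r for r in regions if is_text_like(r)]


if __name__ == "__main__":
    candidates = [
        Region(0, 0, 20, 30, 300),   # compact blob
        Region(0, 0, 200, 2, 350),   # long thin line
        Region(5, 5, 3, 3, 9),       # tiny speck
    ]
    for region in filter_candidates(candidates):
        print(region)
```

In a full pipeline, the regions passed to `filter_candidates` would come from an MSER detector, and the survivors would go on to the second-stage classifier.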
