KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark

Vannkinh Nom; Souhail Bakkali; Muhammad Muzzamil Luqman; Mickaël Coustaty; Jean-Marc Ogier

arXiv:2410.18277·cs.CV·October 25, 2024

TL;DR

This paper introduces KhmerST, the first Khmer scene text detection and recognition dataset, addressing the lack of resources for low-resourced languages and providing a benchmark for future research.

Contribution

It presents a new, annotated Khmer scene-text dataset with baseline models, enabling progress in low-resource language text detection and recognition.

Findings

01. First Khmer scene-text dataset with 1,544 images
02. Includes diverse text conditions like poor lighting and occlusion
03. Provides baseline detection and recognition models

Abstract

Developing effective scene text detection and recognition models hinges on extensive training data, which can be both laborious and costly to obtain, especially for low-resourced languages. Conventional methods tailored for Latin characters often falter with non-Latin scripts due to challenges like character stacking, diacritics, and variable character widths without clear word boundaries. In this paper, we introduce the first Khmer scene-text dataset, featuring 1,544 expert-annotated images, including 997 indoor and 547 outdoor scenes. This diverse dataset includes flat text, raised text, poorly illuminated text, distant and partially obscured text. Annotations provide line-level text and polygonal bounding box coordinates for each scene. The benchmark includes baseline models for scene-text detection and recognition tasks, providing a robust starting point for future research…
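
The abstract says each annotation pairs line-level text with polygonal bounding-box coordinates. As a rough illustration of working with such a record, the sketch below derives an axis-aligned box and the enclosed area from a polygon. The `LineAnnotation` class, its field names, and the sample values are assumptions made for illustration; they are not the dataset's actual file format.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class LineAnnotation:
    """Hypothetical record: one text line and its polygon outline."""
    text: str
    polygon: List[Point]  # vertices in order, in pixel coordinates


def bounding_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) around a polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(polygon: List[Point]) -> float:
    """Return the area of a simple polygon using the shoelace formula."""
    twice_area = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Made-up sample values, not taken from the dataset.
example = LineAnnotation(
    text="សួស្តី",
    polygon=[(10, 20), (110, 25), (108, 60), (8, 55)],
)

if __name__ == "__main__":
    print(bounding_box(example.polygon))
    print(polygon_area(example.polygon))
```

An axis-aligned box is convenient for cropping a text line for a recognizer, while the polygon itself follows slanted or perspective-distorted text more tightly, which is one likely reason to annotate with polygons.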

Code & Models

Datasets

SoyVitou/KhmerST-Dataset-OCR-Cropped
dataset · 10 downloads

Taxonomy

Topics: Handwritten Text Recognition Techniques