Separate Scene Text Detector for Unseen Scripts is Not All You Need

Prateek Keserwani, Taveena Lotey, Rohit Keshari, and Partha Pratim Roy

arXiv:2307.15991·cs.CV·August 1, 2023

TL;DR

This paper explores the challenge of detecting unseen scripts in scene text detection, proposing a vector embedding approach that leverages stroke information to improve zero-shot detection performance across different scripts.

Contribution

The paper introduces a novel vector embedding method for cross-script scene text detection, addressing the problem of unseen script detection without requiring separate training for each script.

Findings

01. Vector embedding of stroke information improves detection of unseen scripts.

02. Annotation consistency is crucial for cross-script detection performance.

03. The proposed method shows promising results in a zero-shot setting on a multilingual dataset.

Abstract

Text detection in the wild is a well-known problem that becomes more challenging when handling multiple scripts. In the last decade, some scripts have gained the attention of the research community, and good detection performance has been achieved on them. However, many scripts are low-resourced for training deep learning-based scene text detectors. This raises a critical question: is separate training needed for each new script? This question is unexplored in the field of scene text detection. This paper acknowledges the problem and proposes a solution for detecting scripts not present during training. In this work, an analysis has been performed to understand cross-script text detection, i.e., training on one script and testing on another. We found that the identical nature of text annotation (word-level/line-level) is crucial for better cross-script text detection. The different nature of text annotation…

Taxonomy

Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction