Real-time Scene Text Detection Based on Global Level and Word Level Features

Fuqiang Zhao, Jionghua Yu, Enjun Xing, Wenming Song, and Xue Xu

arXiv:2203.05251·cs.CV·March 11, 2022·1 cites

Open Access

TL;DR

This paper introduces GWNet, a scene text detection framework that combines global-level and word-level features to detect arbitrarily shaped text in natural scenes with improved accuracy and efficiency, and reports higher accuracy than existing detectors.

Contribution

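The paper proposes the GWNet framework, which has two modules: a Global module that makes the DB (Differentiable Binarization) module more adaptive by adding a k submodule and a shift submodule, and an RCNN module that fuses global-level and word-level features for more accurate scene text detection.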

Findings

01. Achieved high F-measures on four benchmark datasets.

02. Outperformed state-of-the-art detectors in accuracy.

03. Demonstrated effectiveness of global and word-level feature fusion.

Abstract

Detecting arbitrarily shaped text in natural scenes with both high accuracy and high efficiency is an extremely challenging task. In this paper, we propose a scene text detection framework, namely GWNet, which mainly includes two modules: a Global module and an RCNN module. Specifically, the Global module improves the adaptive performance of the DB (Differentiable Binarization) module by adding a k submodule and a shift submodule. The two submodules enhance the adaptability of the amplifying factor k, accelerate model convergence, and help produce more accurate detection results. The RCNN module fuses global-level and word-level features. The word-level label is generated by taking the minimum axis-aligned rectangular box of each shrunk polygon. During inference, GWNet uses only global-level features to output simple polygon detections. Experiments on four benchmark datasets, including the MSRA-TD500,…
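The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The first function is the standard DB approximate step function with a fixed amplifying factor k; the abstract does not describe how the k and shift submodules work, so GWNet's adaptive behaviour is not modelled here. The second function computes the minimum axis-aligned rectangle of a polygon that is assumed to be already shrunk. The function names and the sample coordinates are hypothetical.

```python
import math


def differentiable_binarization(p, t, k=50.0):
    """Standard DB approximate step function: 1 / (1 + exp(-k * (p - t))).

    p is a probability-map value and t a threshold-map value, both in [0, 1].
    k is the amplifying factor; it is a fixed constant here (an assumption),
    whereas GWNet is described as making it adaptive.
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def axis_aligned_box(polygon):
    """Minimum axis-aligned rectangle enclosing a polygon.

    polygon is a list of (x, y) vertices, assumed to be the shrunk text
    polygon. Returns (x_min, y_min, x_max, y_max).
    """
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical shrunk polygon for one word instance.
shrunk_polygon = [(12, 20), (58, 14), (61, 33), (15, 40)]
word_box = axis_aligned_box(shrunk_polygon)
```

A larger k makes the binarization curve steeper around p = t, which is why the choice of k affects both how sharp the output map is and how gradients flow during training.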

Taxonomy

Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction