Focus Entirety and Perceive Environment for Arbitrary-Shaped Text Detection

Xu Han; Junyu Gao; Chuang Yang; Yuan Yuan; Qi Wang

arXiv:2409.16827·cs.CV·September 26, 2024

Open Access

TL;DR

This paper introduces a novel multi-information level text detection method that combines instance-level and region-level features to improve accuracy and robustness in detecting arbitrarily-shaped scene text.

Contribution

It proposes the focus entirety module (FEM) and perceive environment module (PEM) to enhance pixel cohesion and environment perception, addressing noise and scale issues in segmentation-based text detection.

Findings

01 Outperforms state-of-the-art methods on four datasets

02 Effective handling of varying text scales and shapes

03 Improved pixel-level accuracy in text detection

Abstract

Because scene text varies widely in font, color, shape, and size, detecting it accurately and efficiently remains a formidable challenge. Among the various detection approaches, segmentation-based approaches have emerged as prominent contenders owing to their flexible pixel-level predictions. However, these methods typically model text instances in a bottom-up manner, which is highly susceptible to noise. In addition, each pixel is predicted in isolation, without any pixel-feature interaction, which also degrades detection performance. To alleviate these problems, we propose a multi-information level arbitrary-shaped text detector consisting of a focus entirety module (FEM) and a perceive environment module (PEM). The former extracts instance-level features and adopts a top-down scheme to model texts, reducing the influence of noise. Specifically, it…
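The abstract's remark that bottom-up modeling is susceptible to noise can be illustrated with a toy example. The sketch below is not the paper's method. It is a generic connected-component grouping of the kind often used as post-processing in segmentation-based detectors, and the function name `connected_components` and the two small masks are hypothetical, chosen only for illustration. It shows how a single false-positive pixel can merge two text instances into one.

```python
from collections import deque


def connected_components(mask):
    """Label 4-connected foreground regions of a binary grid (list of lists).

    Illustrative helper only; not taken from the paper.
    Returns (labels, count), where labels has the same shape as mask.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                # Start a new region and flood-fill it breadth-first.
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Two text instances separated by a one-pixel background gap (column 3).
clean = [
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
]

# Same mask with one false-positive pixel inside the gap.
noisy = [row[:] for row in clean]
noisy[0][3] = 1

_, n_clean = connected_components(clean)
_, n_noisy = connected_components(noisy)
print(n_clean, n_noisy)  # 2 1
```

In this toy setting one mispredicted pixel changes the instance count from two to one, which is the kind of failure that motivates modeling text at the instance level rather than purely pixel by pixel.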

Taxonomy

Topics: Handwritten Text Recognition Techniques

Methods: Focus