Focus Entirety and Perceive Environment for Arbitrary-Shaped Text Detection
Xu Han, Junyu Gao, Chuang Yang, Yuan Yuan, Qi Wang

TL;DR
This paper introduces a novel multi-information level text detection method that combines instance-level and region-level features to improve accuracy and robustness in detecting arbitrarily-shaped scene text.
Contribution
It proposes the focus entirety module (FEM) and perceive environment module (PEM) to enhance pixel cohesion and environment perception, addressing noise and scale issues in segmentation-based text detection.
Findings
Outperforms state-of-the-art methods on four datasets
Effective handling of varying text scales and shapes
Improved pixel-level accuracy in text detection
Abstract
Due to the diversity of scene text in aspects such as font, color, shape, and size, accurately and efficiently detecting text is still a formidable challenge. Among the various detection approaches, segmentation-based approaches have emerged as prominent contenders owing to their flexible pixel-level predictions. However, these methods typically model text instances in a bottom-up manner, which is highly susceptible to noise. In addition, each pixel is predicted in isolation, without any pixel-feature interaction, which also affects detection performance. To alleviate these problems, we propose a multi-information level arbitrary-shaped text detector consisting of a focus entirety module (FEM) and a perceive environment module (PEM). The former extracts instance-level features and adopts a top-down scheme to model text instances, reducing the influence of noise. Specifically, it…
Taxonomy
Topics: Handwritten Text Recognition Techniques
Methods: Focus
