Spotlight Text Detector: Spotlight on Candidate Regions Like a Camera

Xu Han, Junyu Gao, Chuang Yang, Yuan Yuan, and Qi Wang

arXiv:2409.16820 · cs.CV · September 26, 2024

Open Access

TL;DR

This paper introduces the Spotlight Text Detector (STD), a novel approach that improves scene text detection by focusing on candidate regions and leveraging multiple geometric features, outperforming existing methods.

Contribution

The paper proposes the STD with a spotlight calibration module and multivariate information extraction, addressing issues of false positives and geometric variability in scene text detection.

Findings

1. STD outperforms state-of-the-art methods on multiple datasets.
2. The spotlight calibration module reduces false positives.
3. Using multiple geometric features enhances detection accuracy.

Abstract

Irregular contour representation is one of the tough challenges in scene text detection. Although segmentation-based methods have made significant progress thanks to flexible pixel-level prediction, the overlap of spatially close texts makes it hard to detect them separately. To alleviate this problem, some shrink-based methods predict text kernels and expand them to reconstruct the texts. However, a text kernel is an artificial object with incomplete semantic features, which makes it prone to incorrect or missed detections. In addition, unlike general objects, the geometric features (aspect ratio, scale, and shape) of scene texts vary significantly, which makes accurate detection difficult. To address these problems, we propose an effective spotlight text detector (STD), which consists of a spotlight calibration module (SCM) and a multivariate information…
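The shrink-based idea mentioned in the abstract can be illustrated with a minimal sketch. It uses the offset rule d = A(1 - r^2) / L that shrink-based detectors such as PSENet and DB commonly apply to turn a text polygon into a smaller kernel, where A is the polygon area, L its perimeter, and r a shrink ratio. This is not STD's own method, which the truncated abstract does not specify; the function names, the default ratio of 0.4, and the axis-aligned box helper are assumptions chosen for illustration.

```python
import math


def polygon_area(points):
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths of a closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def shrink_offset(points, ratio=0.4):
    """Inward offset d = A * (1 - r^2) / L used to build a text kernel.

    `ratio` is an assumed hyperparameter, not a value taken from the paper.
    """
    return polygon_area(points) * (1.0 - ratio ** 2) / polygon_perimeter(points)


def shrink_axis_aligned_box(box, offset):
    """Shrink an (x_min, y_min, x_max, y_max) box inward by `offset`.

    Returns None when the box is too small to survive the shrink.
    Real detectors clip arbitrary polygons; a box keeps the sketch simple.
    """
    x_min, y_min, x_max, y_max = box
    if 2 * offset >= min(x_max - x_min, y_max - y_min):
        return None
    return (x_min + offset, y_min + offset, x_max - offset, y_max - offset)


# A 100 x 20 text box, written as a polygon and as a box.
text_polygon = [(0, 0), (100, 0), (100, 20), (0, 20)]
d = shrink_offset(text_polygon, ratio=0.4)
kernel = shrink_axis_aligned_box((0, 0, 100, 20), d)
print(d, kernel)
```

Because the offset scales with area over perimeter, long thin text instances shrink by only a few pixels, which keeps kernels of neighboring texts separated without erasing them entirely.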
