Explicit Relational Reasoning Network for Scene Text Detection

Yuchen Su; Zhineng Chen; Yongkun Du; Zhilong Ji; Kai Hu; Jinfeng Bai; Xieping Gao

arXiv:2412.14692 · cs.CV · February 10, 2025

TL;DR

ERRNet introduces an end-to-end scene text detection method that models component relationships explicitly, eliminating post-processing and achieving state-of-the-art accuracy with high efficiency.

Contribution

The paper proposes ERRNet, a relational reasoning network that treats the ordered components of a text instance as objects in a tracking framework, removing the need for post-processing in connected-component (CC) based text detection.

Findings

01. Achieves state-of-the-art accuracy on benchmarks.
02. Eliminates post-processing in scene text detection.
03. Maintains high inference speed.

Abstract

Connected components (CCs) are a natural text shape representation that aligns with human reading intuition. However, CC-based text detection methods have recently hit a developmental bottleneck: their time-consuming post-processing is difficult to eliminate. To address this issue, we introduce an explicit relational reasoning network (ERRNet) that models the component relationships without post-processing. Concretely, we first represent each text instance as multiple ordered text components, and then treat these components as objects in sequential movement. In this way, scene text detection can be viewed as a tracking problem. From this perspective, we design an end-to-end tracking decoder, yielding a CC-based method that dispenses with post-processing entirely. Additionally, we observe that there is an inconsistency between classification confidence and localization…
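The idea of representing a text instance as multiple ordered components can be illustrated with a small sketch. This is not the paper's implementation: the quadrilateral component format, the arc-length resampling, and the function names `resample`, `split_into_components`, and `merge_components` are all assumptions chosen for illustration. The sketch shows only why an explicit order makes grouping trivial: once each component carries its position in the sequence, the text polygon is recovered by concatenation, with no separate post-processing step.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Hypothetical component format: (top-left, top-right, bottom-right, bottom-left).
Component = Tuple[Point, Point, Point, Point]


def resample(polyline: List[Point], n: int) -> List[Point]:
    """Resample a polyline to n points evenly spaced by arc length."""
    if n < 2 or len(polyline) < 2:
        raise ValueError("need at least 2 input points and n >= 2")
    # Cumulative arc length at each input vertex.
    lengths = [0.0]
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        lengths.append(lengths[-1] + ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5)
    total = lengths[-1]
    out: List[Point] = []
    seg = 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(polyline) - 2 and lengths[seg + 1] < target:
            seg += 1
        span = lengths[seg + 1] - lengths[seg]
        t = 0.0 if span == 0 else (target - lengths[seg]) / span
        (x0, y0), (x1, y1) = polyline[seg], polyline[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def split_into_components(top: List[Point], bottom: List[Point], k: int) -> List[Component]:
    """Cut a text instance (top and bottom boundaries, both left to right)
    into k components ordered along the reading direction."""
    if k < 1:
        raise ValueError("k must be >= 1")
    top_pts = resample(top, k + 1)
    bottom_pts = resample(bottom, k + 1)
    return [
        (top_pts[i], top_pts[i + 1], bottom_pts[i + 1], bottom_pts[i])
        for i in range(k)
    ]


def merge_components(components: List[Component]) -> List[Point]:
    """Rebuild the text polygon from ordered components by concatenation.
    The order is explicit, so no clustering or linking is needed."""
    if not components:
        raise ValueError("need at least one component")
    top_edge = [components[0][0]] + [c[1] for c in components]
    bottom_edge = [components[-1][2]] + [c[3] for c in reversed(components)]
    return top_edge + bottom_edge
```

In the tracking analogy from the abstract, the index of a component in the list plays a role similar to a time step, and the component itself is the object state at that step. How ERRNet actually parameterizes and decodes components is not specified in this excerpt.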

Taxonomy

Topics: Advanced Text Analysis Techniques · Semantic Web and Ontologies · Rough Sets and Fuzzy Logic