Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling

Shi-Xue Zhang; Chun Yang; Xiaobin Zhu; Hongyang Zhou; Hongfa Wang; Xu-Cheng Yin

arXiv:2401.03637 · cs.CV · January 9, 2024 · 1 citation

TL;DR

This paper introduces IATS, an end-to-end framework for scene text spotting that effectively handles inverse-like texts with complex layouts by estimating reading order and dynamically sampling features, improving accuracy on challenging datasets.

Contribution

The paper proposes a novel reading-order estimation module and a dynamic sampling module within an end-to-end trainable framework for inverse-like scene text spotting, addressing complex text layouts.

Findings

01. Achieves superior performance on inverse-like and irregular scene text datasets.

02. Effectively handles mirrored, symmetrical, and retro-flexed texts.

03. Outperforms existing methods in challenging text spotting scenarios.

Abstract

Scene text spotting is a challenging task, especially for inverse-like scene text, which has complex layouts, e.g., mirrored, symmetrical, or retro-flexed. In this paper, we propose a unified end-to-end trainable inverse-like antagonistic text spotting framework dubbed IATS, which can effectively spot inverse-like scene texts without sacrificing general ones. Specifically, we propose an innovative reading-order estimation module (REM) that extracts reading-order information from the initial text boundary generated by an initial boundary module (IBM). To optimize and train REM, we propose a joint reading-order estimation loss consisting of a classification loss, an orthogonality loss, and a distribution loss. With the help of IBM, we can divide the initial text boundary into two symmetric control points and iteratively refine the new text boundary using a lightweight boundary refinement…
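The abstract mentions dividing the initial text boundary into two symmetric sets of control points once the reading order is known. The paper's actual procedure is not given on this page, so the following is only a minimal sketch of one plausible reading of that step. It assumes the boundary is a closed contour with an even number of points listed in order around the text, and that the reading-order estimate can be reduced to a single start index. The function name `split_boundary` and the `start` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def split_boundary(boundary: List[Point], start: int) -> Tuple[List[Point], List[Point]]:
    """Split a closed boundary into two paired sides (hypothetical helper).

    `boundary` holds an even number of points ordered around the contour.
    `start` is an assumed stand-in for the reading-order estimate: the index
    of the point where reading begins.
    """
    n = len(boundary)
    if n == 0 or n % 2 != 0:
        raise ValueError("expected a non-empty, even number of boundary points")
    if not 0 <= start < n:
        raise ValueError("start index out of range")

    # Rotate the contour so that it begins at the reading start point.
    rotated = boundary[start:] + boundary[:start]
    half = n // 2

    # The first half runs along one side in reading direction.
    first_side = rotated[:half]
    # The second half runs back along the other side, so reverse it
    # to pair first_side[i] with second_side[i].
    second_side = rotated[half:][::-1]
    return first_side, second_side


# A 6-point rectangle-like contour around a horizontal word.
contour = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]
upright = split_boundary(contour, start=0)
inverted = split_boundary(contour, start=3)
```

In this sketch, changing `start` from 0 to 3 yields the same contour read from the opposite corner, which is how the same pairing logic could serve both upright and inverse-like text under the stated assumptions.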

Taxonomy

Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Hand Gesture Recognition Systems

Methods: Convolution · Q-Learning · Dense Connections · Deep Q-Network · Random Ensemble Mixture