Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling
Shi-Xue Zhang, Chun Yang, Xiaobin Zhu, Hongyang Zhou, Hongfa Wang, Xu-Cheng Yin

TL;DR
This paper introduces IATS, an end-to-end framework for scene text spotting that effectively handles inverse-like texts with complex layouts by estimating reading order and dynamically sampling features, improving accuracy on challenging datasets.
Contribution
The paper proposes a novel reading-order estimation module and a dynamic sampling module within an end-to-end trainable framework for inverse-like scene text spotting, addressing complex text layouts.
Findings
Achieves superior performance on inverse-like and irregular scene text datasets.
Effectively handles mirrored, symmetrical, and retro-flexed texts.
Outperforms existing methods in challenging text spotting scenarios.
Abstract
Scene text spotting is a challenging task, especially for inverse-like scene text, which has complex layouts, e.g., mirrored, symmetrical, or retro-flexed. In this paper, we propose a unified end-to-end trainable inverse-like antagonistic text spotting framework dubbed IATS, which can effectively spot inverse-like scene texts without sacrificing general ones. Specifically, we propose an innovative reading-order estimation module (REM) that extracts reading-order information from the initial text boundary generated by an initial boundary module (IBM). To optimize and train REM, we propose a joint reading-order estimation loss consisting of a classification loss, an orthogonality loss, and a distribution loss. With the help of IBM, we can divide the initial text boundary into two symmetric control points and iteratively refine the new text boundary using a lightweight boundary refinement…
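As a rough illustration of the step where the initial boundary is divided into two symmetric sets of control points, the sketch below splits a closed polygon of 2N points into two paired sides of N points each. It is not the authors' implementation: the function name `split_boundary`, the `start` index (standing in for whatever a reading-order estimate would supply), and the point layout are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def split_boundary(boundary: List[Point], start: int) -> Tuple[List[Point], List[Point]]:
    """Split a closed boundary of 2N points into two paired sides of N points.

    `start` is a hypothetical index of the boundary point where reading
    begins; in this sketch it is simply given by the caller.
    """
    if len(boundary) == 0 or len(boundary) % 2 != 0:
        raise ValueError("boundary must have a positive, even number of points")
    n = len(boundary) // 2
    start %= len(boundary)
    # Rotate the closed polygon so the chosen start point comes first.
    rotated = boundary[start:] + boundary[:start]
    first_side = rotated[:n]
    # Reverse the second half so both sides run in the same direction,
    # which pairs first_side[i] with second_side[i].
    second_side = rotated[n:][::-1]
    return first_side, second_side


# Usage: a rectangle traced as six points, starting at the upper-left corner.
rect = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]
upright = split_boundary(rect, start=0)
# The same boundary read from the opposite corner, as for text rotated by 180 degrees.
inverted = split_boundary(rect, start=3)
```

Changing only `start` swaps which side is treated as the first one and reverses its direction, which is one way to see why a reading-order estimate matters for inverse-like text.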
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Hand Gesture Recognition Systems
Methods: Convolution · Q-Learning · Dense Connections · Deep Q-Network · Random Ensemble Mixture
