Towards Robust Real-Time Scene Text Detection: From Semantic to Instance Representation Learning
Xugong Qin, Pengyuan Lyu, Chengquan Zhang, Yu Zhou, Kun Yao, Peng Zhang, Hailun Lin, Weiping Wang

TL;DR
This paper introduces a novel representation learning approach for real-time scene text detection, combining semantic contrast and top-down modeling to improve robustness and accuracy without extra inference costs.
Contribution
It proposes global-dense semantic contrast and top-down modeling techniques to enhance encoder robustness in bottom-up segmentation-based scene text detection.
Findings
Achieves 87.2% F-measure at 48.2 FPS on Total-Text
Outperforms or matches state-of-the-art accuracy and speed
No additional parameters during inference
Abstract
Due to the flexible representation of arbitrary-shaped scene text and simple pipeline, bottom-up segmentation-based methods begin to be mainstream in real-time scene text detection. Despite great progress, these methods show deficiencies in robustness and still suffer from false positives and instance adhesion. Different from existing methods which integrate multiple-granularity features or multiple outputs, we resort to the perspective of representation learning in which auxiliary tasks are utilized to enable the encoder to jointly learn robust features with the main task of per-pixel classification during optimization. For semantic representation learning, we propose global-dense semantic contrast (GDSC), in which a vector is extracted for global semantic representation, then used to perform element-wise contrast with the dense grid features. To learn instance-aware representation, we…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Retrieval and Classification Techniques
