Text-Attentional Convolutional Neural Networks for Scene Text Detection
Tong He, Weilin Huang, Yu Qiao, and Jian Yao

TL;DR
This paper introduces a novel Text-Attentional CNN for scene text detection that leverages rich supervision and a contrast-enhanced region detector to improve robustness and accuracy in challenging backgrounds.
Contribution
The work proposes a new Text-CNN trained with multi-level supervision (text region mask, character label, and binary text/non-text label) and pairs it with a contrast-enhanced region detector, reaching an F-measure of 0.82 on ICDAR 2013.
Findings
Achieved an F-measure of 0.82 on the ICDAR 2013 dataset
Enhanced robustness against complex backgrounds
Improved state-of-the-art detection recall
Abstract
Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature computed globally from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this work, we present a new system for scene text detection by proposing a novel Text-Attentional Convolutional Neural Network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/nontext information. The rich supervision information enables the Text-CNN with a strong capability…
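The multi-level supervision described in the abstract can be illustrated as a weighted sum of three per-sample losses. This is a minimal sketch rather than the authors' implementation. The abstract names only the three supervision signals, so the following are all assumptions made for illustration: the specific loss forms (softmax cross-entropy and mean squared error), the weights `w_char` and `w_mask`, the dictionary keys, the 36-way character class count in the usage note, and the choice to skip the character loss for non-text samples.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample with an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def mask_mse(pred, target):
    """Mean squared error between two flattened masks of equal length."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def multi_task_loss(sample, w_char=1.0, w_mask=1.0):
    """Combine the three supervision signals for one image component.

    All keys of `sample` are hypothetical names chosen for this sketch:
      text_logits, is_text    -> binary text/non-text task
      char_logits, char_label -> character-label task (text samples only)
      pred_mask, true_mask    -> text-region mask regression
    """
    loss = cross_entropy(sample["text_logits"], sample["is_text"])
    if sample["is_text"] == 1:
        # Assumption: character labels are only defined for text components.
        loss += w_char * cross_entropy(sample["char_logits"], sample["char_label"])
    loss += w_mask * mask_mse(sample["pred_mask"], sample["true_mask"])
    return loss
```

As a usage example, a text sample with uniform (all-zero) logits over 2 text classes and 36 character classes, and a predicted mask of 0.5 everywhere against a half-foreground target, gives a loss of ln 2 + ln 36 + 0.25 with the default weights. Lowering `w_char` or `w_mask` shifts emphasis back toward the binary text/non-text task.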
