DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images
Zhuoyao Zhong, Lianwen Jin, Shuye Zhang, Ziyong Feng

TL;DR
DeepText is a unified CNN-based framework for text proposal generation and text detection in natural images. It reports F-measures of 0.83 on ICDAR 2011 and 0.85 on ICDAR 2013, outperforming previous methods.
Contribution
The paper presents a novel unified CNN framework with inception-RPN and multilevel pooling for efficient text detection and high recall in natural images.
Findings
Achieves F-measure of 0.83 on ICDAR 2011
Achieves F-measure of 0.85 on ICDAR 2013
Outperforms previous state-of-the-art methods
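For readers unfamiliar with the metric, the F-measure reported above is conventionally the harmonic mean of precision and recall. The sketch below shows only that arithmetic. The function name and the input values are illustrative assumptions, not figures from the paper, and the ICDAR benchmarks apply their own box-matching protocol before precision and recall are computed.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs, chosen only to show that the harmonic mean
# sits closer to the lower of the two values.
balanced = f_measure(0.5, 0.5)
skewed = f_measure(1.0, 0.5)
```

Because the harmonic mean penalizes imbalance, a detector cannot reach a high F-measure by maximizing recall alone.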
Abstract
In this paper, we develop a novel unified framework called DeepText for text region proposal generation and text detection in natural images via a fully convolutional neural network (CNN). First, we propose the inception region proposal network (Inception-RPN) and design a set of text characteristic prior bounding boxes to achieve high word recall with only hundred-level candidate proposals. Next, we present a powerful text detection network that embeds ambiguous text category (ATC) information and multilevel region-of-interest pooling (MLRP) for text and non-text classification and accurate localization. Finally, we apply an iterative bounding box voting scheme to pursue high recall in a complementary manner and introduce a filtering algorithm to retain the most suitable bounding box, while removing redundant inner and outer boxes for each text instance. Our approach achieves an…
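The abstract does not spell out how the bounding box voting works, so the following is only a minimal sketch of generic score-weighted box voting, a common post-processing idea in object detection. It is not necessarily the iterative scheme or the inner/outer box filter used in DeepText. The names `iou` and `vote_box`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def vote_box(anchor: Box, boxes: List[Box], scores: List[float],
             iou_thresh: float = 0.5) -> Box:
    """Score-weighted average of every box that overlaps `anchor` enough."""
    total = 0.0
    acc = [0.0, 0.0, 0.0, 0.0]
    for box, score in zip(boxes, scores):
        if iou(anchor, box) >= iou_thresh:
            total += score
            for i in range(4):
                acc[i] += score * box[i]
    if total == 0:
        return anchor  # no supporting boxes: keep the anchor unchanged
    return tuple(v / total for v in acc)


# Hypothetical detections of the same word, slightly offset horizontally.
candidates = [(0.0, 0.0, 10.0, 10.0), (2.0, 0.0, 12.0, 10.0)]
refined = vote_box(candidates[0], candidates, [1.0, 1.0])
```

In this generic form, overlapping candidates refine a kept box rather than being discarded outright, which is one way several detections of the same text instance can complement each other.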
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
