Detecting Text in Natural Image with Connectionist Text Proposal Network

Zhi Tian, Weilin Huang, Tong He, Pan He, and Yu Qiao

arXiv:1609.03605 · cs.CV · October 2, 2016 · 19 cites

TL;DR

The paper introduces a Connectionist Text Proposal Network (CTPN) that accurately detects text lines in natural images using a novel, end-to-end trainable deep learning model that integrates convolutional features with sequential proposals.

Contribution

The paper presents the CTPN model, which combines a vertical anchor mechanism with a recurrent neural network to detect multi-scale, multi-language text precisely without complex post-processing.

Findings

01 Achieves high F-measure on ICDAR benchmarks
02 Operates efficiently at 0.14 seconds per image
03 Outperforms recent state-of-the-art methods

Abstract

We propose a novel Connectionist Text Proposal Network (CTPN) that accurately localizes text lines in natural image. The CTPN detects a text line in a sequence of fine-scale text proposals directly in convolutional feature maps. We develop a vertical anchor mechanism that jointly predicts location and text/non-text score of each fixed-width proposal, considerably improving localization accuracy. The sequential proposals are naturally connected by a recurrent neural network, which is seamlessly incorporated into the convolutional network, resulting in an end-to-end trainable model. This allows the CTPN to explore rich context information of image, making it powerful to detect extremely ambiguous text. The CTPN works reliably on multi-scale and multi-language text without further post-processing, departing from previous bottom-up methods requiring multi-step post-processing. It achieves…
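To make the idea of "a text line as a sequence of fixed-width proposals" concrete, the sketch below shows one way such proposals could be read off as line-level boxes once they have been scored. It is an illustration only, not the authors' procedure: the `Proposal` type, the functions `connect_proposals` and `line_box`, the slice width of 16 pixels, and all thresholds are assumptions introduced here, and the recurrent network that shares context between neighbouring proposals is not modelled at all (a simple greedy neighbour rule stands in for it).

```python
from dataclasses import dataclass
from typing import List, Tuple

WIDTH = 16  # assumed fixed proposal width in pixels (illustrative value)


@dataclass
class Proposal:
    x: int           # left edge of the fixed-width slice
    y_top: float     # predicted top of the text in this slice
    y_bottom: float  # predicted bottom of the text in this slice
    score: float     # text/non-text score in [0, 1]


def vertical_overlap(a: Proposal, b: Proposal) -> float:
    """Overlap of two vertical extents divided by their combined span."""
    inter = min(a.y_bottom, b.y_bottom) - max(a.y_top, b.y_top)
    span = max(a.y_bottom, b.y_bottom) - min(a.y_top, b.y_top)
    return max(inter, 0.0) / span if span > 0 else 0.0


def connect_proposals(
    proposals: List[Proposal],
    min_score: float = 0.7,    # hypothetical threshold
    max_gap: int = 50,         # hypothetical horizontal gap, in pixels
    min_overlap: float = 0.7,  # hypothetical vertical-overlap threshold
) -> List[List[Proposal]]:
    """Greedily chain confident proposals, left to right, into text lines."""
    kept = sorted((p for p in proposals if p.score >= min_score),
                  key=lambda p: p.x)
    lines: List[List[Proposal]] = []
    for p in kept:
        for line in lines:
            last = line[-1]
            close = p.x - last.x <= max_gap
            if close and vertical_overlap(last, p) >= min_overlap:
                line.append(p)
                break
        else:
            # No existing line accepts this proposal, so it starts a new one.
            lines.append([p])
    return lines


def line_box(line: List[Proposal]) -> Tuple[float, float, float, float]:
    """Bounding box (x_min, y_min, x_max, y_max) of one chained line."""
    return (
        line[0].x,
        min(p.y_top for p in line),
        line[-1].x + WIDTH,
        max(p.y_bottom for p in line),
    )
```

Under these assumptions, three adjacent high-scoring slices with matching vertical extents collapse into a single box, while a low-scoring slice is dropped before chaining. Only the vertical coordinates vary per proposal here, which mirrors the abstract's point that each proposal has a fixed width and the network predicts its vertical location and score.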

Taxonomy

Topics: Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques