Towards Unconstrained End-to-End Text Spotting

Siyang Qin; Alessandro Bissacco; Michalis Raptis; Yasuhisa Fujii; Ying Xiao

arXiv:1908.09231·cs.CV·August 27, 2019·23 cites

TL;DR

This paper introduces an end-to-end trainable network capable of detecting and recognizing arbitrarily shaped scene text, significantly advancing the ability to read irregular text in images.

Contribution

The paper formulates detection of irregularly shaped text as an instance segmentation problem and recognizes each detected region with an attention model, without rectifying it first. This improves end-to-end accuracy on the ICDAR15 and Total-Text benchmarks.

Findings

1. Surpasses the state of the art for end-to-end recognition on ICDAR15 (straight text) by 4.6%.

2. Improves on the state of the art on Total-Text by more than 16%.

3. Introduces RoI masking to extract features for irregularly shaped text instances.

Abstract

We propose an end-to-end trainable network that can simultaneously detect and recognize text of arbitrary shape, making substantial progress on the open problem of reading scene text of irregular shape. We formulate arbitrary shape text detection as an instance segmentation problem; an attention model is then used to decode the textual content of each irregularly shaped text region without rectification. To extract useful irregularly shaped text instance features from image scale features, we propose a simple yet effective RoI masking step. Additionally, we show that predictions from an existing multi-step OCR engine can be leveraged as partially labeled training data, which leads to significant improvements in both the detection and recognition accuracy of our model. Our method surpasses the state-of-the-art for end-to-end recognition tasks on the ICDAR15 (straight) benchmark by 4.6%,…
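The RoI masking step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes that masking means cropping the feature map to the axis-aligned box around a predicted instance mask and zeroing the features outside that mask. The function name `roi_mask_features`, the array layout (height, width, channels), and the toy inputs are hypothetical and are not taken from the paper's implementation.

```python
import numpy as np


def roi_mask_features(features, instance_mask):
    """Crop features to the mask's bounding box and zero out the background.

    features:      float array of shape (H, W, C), image-scale features.
    instance_mask: bool array of shape (H, W), one text instance.
    Both layouts are assumptions made for this sketch.
    """
    if features.shape[:2] != instance_mask.shape:
        raise ValueError("features and instance_mask must share H and W")

    ys, xs = np.nonzero(instance_mask)
    if ys.size == 0:
        raise ValueError("instance_mask contains no foreground pixels")

    # Axis-aligned bounding box of the instance (end indices are exclusive).
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1

    crop = features[y0:y1, x0:x1, :]
    mask_crop = instance_mask[y0:y1, x0:x1].astype(features.dtype)

    # Broadcast the 2-D mask over the channel axis so that pixels outside
    # the text instance contribute nothing to the recognizer's input.
    return crop * mask_crop[..., None]


# Toy example: a diagonal "text instance" on a 6x8 feature map with 2 channels.
features = np.arange(6 * 8 * 2, dtype=np.float32).reshape(6, 8, 2) + 1.0
mask = np.zeros((6, 8), dtype=bool)
mask[1, 2] = mask[2, 3] = mask[3, 4] = True

out = roi_mask_features(features, mask)
print(out.shape)  # (3, 3, 2)
```

In this sketch the masked crop keeps the instance's original geometry, so a curved or slanted word is passed on without being warped to a straight line. Only the features of neighbouring text or background inside the bounding box are suppressed.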

Taxonomy

Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction