Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes
Pengyuan Lyu, Minghui Liao, Cong Yao, Wenhao Wu, Xiang Bai

TL;DR
Mask TextSpotter is an end-to-end trainable neural network that detects and recognizes irregularly shaped scene text, achieving state-of-the-art results on multiple benchmarks.
Contribution
The paper introduces Mask TextSpotter, a model inspired by Mask R-CNN that simultaneously detects and recognizes text of arbitrary shape in natural images.
Findings
Achieves state-of-the-art results on ICDAR datasets.
Effectively handles curved and irregular text shapes.
Simplifies the end-to-end training process.
Abstract
Recently, models based on deep neural networks have dominated the fields of scene text detection and recognition. In this paper, we investigate the problem of scene text spotting, which aims at simultaneous text detection and recognition in natural images. An end-to-end trainable neural network model for scene text spotting is proposed. The proposed model, named as Mask TextSpotter, is inspired by the newly published work Mask R-CNN. Different from previous methods that also accomplish text spotting with end-to-end trainable deep neural networks, Mask TextSpotter takes advantage of simple and smooth end-to-end learning procedure, in which precise text detection and recognition are acquired via semantic segmentation. Moreover, it is superior to previous methods in handling text instances of irregular shapes, for example, curved text. Experiments on ICDAR2013, ICDAR2015 and Total-Text…
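The abstract's idea of acquiring recognition via semantic segmentation can be illustrated with a small sketch. It assumes the network emits one probability map per character class plus a background map for a text region; this summary does not describe the paper's actual decoding procedure, so the grouping of pixels into characters by runs of text columns is a simplification chosen here for illustration. The function name `decode_text`, the `bg_threshold` parameter, and the toy alphabet are all hypothetical.

```python
import numpy as np

def decode_text(char_probs, alphabet, bg_threshold=0.5):
    """Read a left-to-right string from per-pixel class probabilities.

    char_probs has shape (1 + len(alphabet), H, W): channel 0 is background,
    channel k is the k-th character of `alphabet`. Both the layout and the
    column-run grouping below are assumptions made for this sketch.
    """
    # Pixels with low background probability are treated as text.
    foreground = char_probs[0] < bg_threshold          # (H, W)
    # Columns that contain at least one text pixel.
    cols = foreground.any(axis=0)                      # (W,)
    # Locate runs of consecutive text columns as [start, end) pairs.
    padded = np.concatenate(([False], cols, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2]

    chars = []
    for s, e in zip(starts, ends):
        region = foreground[:, s:e]                    # text pixels of this run
        # Average each character class over the run's text pixels and vote.
        scores = char_probs[1:, :, s:e][:, region].mean(axis=1)
        chars.append(alphabet[int(scores.argmax())])
    return "".join(chars)

# Toy example: a 4x9 region holding a "B" blob followed by an "A" blob.
alphabet = "AB"
probs = np.zeros((3, 4, 9))
probs[0] = 1.0                                         # everything starts as background
for cls, (s, e) in [(2, (1, 4)), (1, (5, 8))]:
    probs[0, 1:3, s:e] = 0.1
    probs[cls, 1:3, s:e] = 0.9
print(decode_text(probs, alphabet))                    # prints "BA"
```

Splitting on column runs only works for roughly horizontal text with gaps between characters. Curved text, which the abstract names as a strength of the model, would need a true connected-component grouping instead.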
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction
Methods: Region Proposal Network · Softmax · RoIAlign · Convolution · Mask R-CNN
