Shape Robust Text Detection with Progressive Scale Expansion Network
Xiang Li, Wenhai Wang, Wenbo Hou, Ruo-Ze Liu, Tong Lu, Jian Yang

TL;DR
This paper introduces PSENet, a segmentation-based text detector that handles arbitrarily shaped and closely spaced text instances by progressively expanding predicted kernels from their smallest scale to the full text shape, reporting state-of-the-art results on ICDAR 2015, ICDAR 2017 MLT, and SCUT-CTW1500.
Contribution
The paper proposes a novel Progressive Scale Expansion Network that improves shape robustness and instance separation in text detection through multi-scale kernel predictions and a progressive expansion algorithm.
Findings
Achieves state-of-the-art results on ICDAR 2015 and ICDAR 2017 MLT benchmarks.
Outperforms the previous best result by an absolute 6.37% on the curved-text dataset SCUT-CTW1500.
Effectively distinguishes adjacent text instances with arbitrary shapes.
Abstract
The challenges of shape-robust text detection lie in two aspects: 1) most existing quadrangular bounding-box-based detectors have difficulty locating texts with arbitrary shapes, which are hard to enclose perfectly in a rectangle; 2) most pixel-wise segmentation-based detectors may not separate text instances that are very close to each other. To address these problems, we propose a novel Progressive Scale Expansion Network (PSENet), designed as a segmentation-based detector with multiple predictions for each text instance. These predictions correspond to different "kernels" produced by shrinking the original text instance to various scales. Consequently, the final detection can be conducted through our progressive scale expansion algorithm, which gradually expands the kernels with minimal scales to the text instances with maximal and complete shapes. Due to the fact that there…
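The expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the kernels arrive as binary masks ordered from smallest to largest, uses 4-connectivity, and resolves contested pixels on a first-come, first-served basis through a breadth-first queue. The function names `label_components` and `progressive_scale_expansion` are hypothetical.

```python
from collections import deque

# 4-connected neighbourhood (an assumption of this sketch)
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label 4-connected components of a binary mask; 0 means background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def progressive_scale_expansion(kernels):
    """Grow instance labels from the smallest kernel out to the largest one.

    kernels: list of binary masks (lists of lists), smallest scale first.
    Returns (label map, number of instances).
    """
    # Seeds: connected components of the smallest kernel, one per instance.
    labels, count = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for mask in kernels[1:]:
        # Start a breadth-first expansion from every pixel labelled so far.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = cy + dy, cx + dx
                # Claim only unlabelled pixels inside the current, larger kernel;
                # a pixel reached by two instances goes to whichever got there first.
                if (0 <= ny < h and 0 <= nx < w
                        and mask[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[cy][cx]
                    queue.append((ny, nx))
    return labels, count


# Toy input: two seeds whose full-scale masks touch and form one blob.
smallest = [[0, 1, 0, 0, 0, 0, 1, 0]]
largest = [[1, 1, 1, 1, 1, 1, 1, 1]]
result, n_instances = progressive_scale_expansion([smallest, largest])
```

In this toy row, plain connected-component labelling of `largest` would find a single region, whereas the expansion keeps two instances, each half of the row inheriting the label of its seed. Starting from the smallest kernels is what allows adjacent text instances to stay separate.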
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Image Retrieval and Classification Techniques
