Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection
Shi-Xue Zhang, Xiaobin Zhu, Chun Yang, Hongfa Wang, Xu-Cheng Yin

TL;DR
This paper introduces an adaptive boundary proposal network that detects arbitrary-shape text in images by generating text boundaries directly, without post-processing. It pairs a boundary proposal model built from multi-layer dilated convolutions with a boundary deformation model whose encoder combines a GCN and an RNN.
Contribution
The main contribution is an adaptive boundary deformation model that iteratively refines coarse boundary proposals with a GCN and an RNN, so that accurate arbitrary-shape text boundaries are obtained without complex post-processing.
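The iterative refinement idea can be illustrated with a minimal sketch. It is not the authors' implementation: `ring_aggregate` is a fixed neighbor average standing in for a learned graph convolution over the closed boundary, and `toy_predictor` is a hypothetical offset predictor that simply moves each vertex halfway toward a known target. In the paper, the offsets would come from the learned encoder-decoder network instead.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Predictor = Callable[[List[Point], List[Point]], List[Point]]


def ring_aggregate(points: List[Point]) -> List[Point]:
    """Average each vertex with its two neighbors on the closed boundary.

    Assumption: this fixed average is only a stand-in for one graph
    convolution step; a real GCN applies learned weights to features.
    """
    n = len(points)
    out = []
    for i in range(n):
        px, py = points[i - 1]        # previous vertex (index -1 wraps around)
        cx, cy = points[i]
        nx, ny = points[(i + 1) % n]  # next vertex (wraps around)
        out.append(((px + cx + nx) / 3.0, (py + cy + ny) / 3.0))
    return out


def deform(points: List[Point], predict_offsets: Predictor,
           iterations: int = 3) -> List[Point]:
    """Refine a boundary by repeatedly adding predicted per-vertex offsets."""
    for _ in range(iterations):
        context = ring_aggregate(points)
        offsets = predict_offsets(points, context)
        points = [(x + dx, y + dy)
                  for (x, y), (dx, dy) in zip(points, offsets)]
    return points


def toy_predictor(target: List[Point]) -> Predictor:
    """Hypothetical predictor: step halfway toward a known target boundary.

    It ignores the aggregated context; a learned model would use it.
    """
    def predict(points: List[Point], context: List[Point]) -> List[Point]:
        return [(0.5 * (tx - x), 0.5 * (ty - y))
                for (x, y), (tx, ty) in zip(points, target)]
    return predict


coarse = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
target = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)]
refined = deform(coarse, toy_predictor(target), iterations=3)
```

With this toy predictor the gap to the target halves at every iteration, which makes the effect of the number of iterations easy to inspect.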
Findings
Achieves state-of-the-art results on public datasets.
Effectively models complex text shapes with boundary deformation.
Reduces reliance on post-processing steps.
Abstract
Arbitrary shape text detection is a challenging task due to the high complexity and variety of scene texts. In this work, we propose a novel adaptive boundary proposal network for arbitrary shape text detection, which can learn to directly produce accurate boundaries for arbitrary shape text without any post-processing. Our method mainly consists of a boundary proposal model and an innovative adaptive boundary deformation model. The boundary proposal model, constructed from multi-layer dilated convolutions, is adopted to produce prior information (including a classification map, a distance field, and a direction field) and coarse boundary proposals. The adaptive boundary deformation model is an encoder-decoder network, in which the encoder mainly consists of a Graph Convolutional Network (GCN) and a Recurrent Neural Network (RNN). It aims to perform boundary deformation in an iterative way for…
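To make the prior information more concrete, here is a rough sketch of how a distance field and a direction field could be derived from a binary text mask. The definitions are assumptions made for illustration and may differ from the paper's: distance is the Euclidean distance from each text pixel to its nearest background pixel, normalized by the largest value in the mask, and direction is the unit vector pointing from that background pixel to the text pixel. The function name `prior_fields` is hypothetical, and the brute-force search is only meant for tiny masks.

```python
import math
from typing import List, Tuple

Mask = List[List[int]]
Vec = Tuple[float, float]


def prior_fields(mask: Mask) -> Tuple[List[List[float]], List[List[Vec]]]:
    """Compute an assumed distance field and direction field for a 0/1 mask."""
    h, w = len(mask), len(mask[0])
    background = [(r, c) for r in range(h) for c in range(w)
                  if mask[r][c] == 0]
    dist = [[0.0] * w for _ in range(h)]
    direction = [[(0.0, 0.0)] * w for _ in range(h)]

    for r in range(h):
        for c in range(w):
            if mask[r][c] == 0 or not background:
                continue  # background pixels keep zero distance and direction
            # Nearest background pixel by squared Euclidean distance.
            br, bc = min(background,
                         key=lambda p: (p[0] - r) ** 2 + (p[1] - c) ** 2)
            d = math.hypot(r - br, c - bc)
            dist[r][c] = d
            # Unit vector (row, col) from the background pixel to this pixel.
            direction[r][c] = ((r - br) / d, (c - bc) / d)

    # Normalize distances to [0, 1] (assumed normalization scheme).
    peak = max(max(row) for row in dist)
    if peak > 0:
        dist = [[v / peak for v in row] for row in dist]
    return dist, direction


# A 5x5 mask with a 3x3 text block in the center.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
dist_field, dir_field = prior_fields(mask)
```

In this example the center pixel receives the largest normalized distance and the pixels on the edge of the text block receive smaller values, which is the kind of signal that can help separate text interiors from boundaries.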
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction
