MT: Multi-Perspective Feature Learning Network for Scene Text Detection
Chuang Yang, Mulin Chen, Yuan Yuan, and Qi Wang

TL;DR
This paper introduces MT, a fast and accurate scene text detection network that effectively detects arbitrary-shaped texts using a lightweight framework, multi-perspective features, and a novel IoU minimization loss.
Contribution
The paper presents a novel multi-perspective feature learning network with a lightweight design and a new loss function for improved scene text detection.
Findings
Outperforms state-of-the-art methods on four datasets
Achieves high detection accuracy with fast inference
Effectively detects arbitrary-shaped texts
Abstract
Text detection, the key technology for understanding scene text, has become an attractive research topic. To detect various scene texts, researchers have proposed many detectors with different advantages: detection-based models enjoy fast detection speed, and segmentation-based algorithms are not limited by text shapes. However, for most intelligent systems, the detector needs to detect arbitrary-shaped texts with high speed and high accuracy simultaneously. Thus, in this study, we design an efficient pipeline named MT, which can detect adhesive arbitrary-shaped texts with only a single binary mask in the inference stage. This paper makes contributions in three aspects: (1) a lightweight detection framework is designed to speed up the inference process while keeping high detection accuracy; (2) a multi-perspective feature module is proposed to learn more discriminative…
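To make the "single binary mask" idea concrete, the sketch below groups the foreground pixels of a toy binary mask into connected regions, each standing in for one candidate text instance. It is a generic illustration only: the function name `label_text_regions`, the 4-connectivity choice, and the toy mask are assumptions, and this excerpt does not describe how MT itself post-processes its mask or separates adhesive texts.

```python
from collections import deque


def label_text_regions(mask):
    """Group foreground pixels (1s) of a binary mask into 4-connected regions.

    Hypothetical helper for illustration; not the post-processing used by MT.
    Returns a list of regions, each a list of (row, col) pixel coordinates.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


# Toy mask (an assumption for the example): two separate blobs of text pixels.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
regions = label_text_regions(toy_mask)
print(len(regions), [len(region) for region in regions])
```

In practice a library routine for connected-component labeling would replace the hand-written flood fill; the pure-Python version is used here only to keep the sketch dependency-free.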
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Advanced Image and Video Retrieval Techniques
