TL;DR
This paper introduces a two-stage, corner-based region proposal method for detecting multi-oriented text in scenes, eliminating the need for prior shape knowledge and improving robustness through a novel pooling layer.
Contribution
It proposes a novel corner-based region proposal approach with adaptive quadrilaterals and a Dual-RoI Pooling layer for robust multi-oriented text detection.
Findings
Achieves performance comparable to state-of-the-art methods on public benchmarks.
Effectively handles various text orientations and aspect ratios.
Makes classification and regression over the proposals more robust by embedding data augmentation in the Dual-RoI Pooling layer.
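To make the idea of augmentation embedded in a pooling layer more concrete, here is a minimal sketch in plain Python. The summary does not say which augmentation Dual-RoI Pooling applies or how the features are fused, so everything specific here is an assumption made for illustration: the region is axis-aligned, the augmented view is a 180-degree rotation of the cropped features, and the two pooled grids are fused by element-wise sum. The function names `max_pool_to_grid`, `rotate_180`, and `dual_pool` are hypothetical and are not taken from the paper's code.

```python
def max_pool_to_grid(region, out_h, out_w):
    """Max-pool a 2-D feature crop (list of lists) to a fixed out_h x out_w grid."""
    h, w = len(region), len(region[0])
    pooled = []
    for i in range(out_h):
        y0 = i * h // out_h
        y1 = max((i + 1) * h // out_h, y0 + 1)  # every bin covers at least one row
        row = []
        for j in range(out_w):
            x0 = j * w // out_w
            x1 = max((j + 1) * w // out_w, x0 + 1)  # and at least one column
            row.append(max(region[y][x]
                           for y in range(y0, y1)
                           for x in range(x0, x1)))
        pooled.append(row)
    return pooled


def rotate_180(region):
    """ASSUMED augmentation: rotate the cropped features by 180 degrees."""
    return [row[::-1] for row in region[::-1]]


def dual_pool(feature_map, box, out_size=(2, 2)):
    """Pool a region and its augmented copy, then fuse the two grids.

    box is (x0, y0, x1, y1) with exclusive upper bounds.
    ASSUMED fusion: element-wise sum of the two pooled grids.
    """
    x0, y0, x1, y1 = box
    crop = [row[x0:x1] for row in feature_map[y0:y1]]
    pooled = max_pool_to_grid(crop, *out_size)
    pooled_aug = max_pool_to_grid(rotate_180(crop), *out_size)
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pooled, pooled_aug)]


# Toy 4x4 feature map with values 0..15.
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]
fused = dual_pool(feature_map, (0, 0, 4, 4))
```

Because both views pass through the same pooling step, the downstream classifier and regressor see a single fused feature per proposal, with no augmented copies added to the training set.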
Abstract
Previous approaches for scene text detection usually rely on manually defined sliding windows. This work presents an intuitive two-stage region-based method to detect multi-oriented text without any prior knowledge regarding the textual shape. In the first stage, we estimate the possible locations of text instances by detecting and linking corners instead of shifting a set of default anchors. The quadrilateral proposals are geometry adaptive, which allows our method to cope with various text aspect ratios and orientations. In the second stage, we design a new pooling layer named Dual-RoI Pooling which embeds data augmentation inside the region-wise subnetwork for more robust classification and regression over these proposals. Experimental results on public benchmarks confirm that the proposed method is capable of achieving comparable performance with state-of-the-art methods. The code…
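The first stage described above, building quadrilateral proposals from detected corners rather than from default anchors, can be pictured with a small sketch. The abstract does not give the linking rule, so this example assumes something simple: corner candidates are already grouped by type (top-left, top-right, bottom-right, bottom-left), every combination is enumerated, and a combination is kept only if it forms a convex quadrilateral with a minimum area. The names `link_corners`, `is_convex_clockwise`, and `polygon_area` and the `min_area` threshold are hypothetical, not taken from the paper.

```python
from itertools import product


def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def is_convex_clockwise(quad):
    """True if the four points form a strictly convex quadrilateral listed
    clockwise in image coordinates (x to the right, y downward)."""
    return all(
        cross(quad[i], quad[(i + 1) % 4], quad[(i + 2) % 4]) > 0
        for i in range(4)
    )


def polygon_area(quad):
    """Area of the quadrilateral via the shoelace formula."""
    total = 0.0
    for i in range(4):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % 4]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def link_corners(corners, min_area=1.0):
    """Enumerate (tl, tr, br, bl) combinations and keep the valid ones.

    corners maps 'tl', 'tr', 'br', 'bl' to lists of (x, y) candidates.
    ASSUMPTION: brute-force enumeration with a geometric validity check
    stands in for the paper's actual corner-linking procedure.
    """
    proposals = []
    for quad in product(corners["tl"], corners["tr"],
                        corners["br"], corners["bl"]):
        if is_convex_clockwise(quad) and polygon_area(quad) >= min_area:
            proposals.append(quad)
    return proposals


# One rotated text instance plus a spurious top-left candidate at (9, 9).
candidates = {
    "tl": [(2, 0), (9, 9)],
    "tr": [(10, 3)],
    "br": [(8, 8)],
    "bl": [(0, 5)],
}
proposals = link_corners(candidates)
```

Each proposal's shape comes from the corners themselves, so a rotated or elongated text line yields a rotated or elongated quadrilateral with no predefined set of anchor scales and aspect ratios. In this toy input the spurious corner is rejected because linking it would produce a non-convex shape.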
