Deep Direct Regression for Multi-Oriented Scene Text Detection

Wenhao He; Xu-Yao Zhang; Fei Yin; Cheng-Lin Liu

arXiv:1703.08289 · cs.CV · March 27, 2017 · 54 cites

Open Access

TL;DR

This paper introduces a deep direct regression approach for multi-oriented scene text detection that simplifies the detection pipeline and reports state-of-the-art results on multiple benchmarks.

Contribution

The paper proposes a deep direct regression framework for scene text detection, built on a fully convolutional network that is trained end to end and followed by one-step post-processing.

Findings

01. Achieves 81% F1 on the ICDAR2015 benchmark.

02. Outperforms previous methods on multiple datasets.

03. Simplifies detection with a one-step, fully convolutional approach.

Abstract

In this paper, we first provide a new perspective to divide existing high-performance object detection methods into direct and indirect regression. Direct regression performs boundary regression by predicting the offsets from a given point, while indirect regression predicts the offsets from some bounding box proposals. Then we analyze the drawbacks of indirect regression, which recent state-of-the-art detection structures like Faster-RCNN and SSD follow, for multi-oriented scene text detection, and point out the potential superiority of direct regression. To verify this point of view, we propose a deep direct regression based method for multi-oriented scene text detection. Our detection framework is simple and effective, with a fully convolutional network and one-step post-processing. The fully convolutional network is optimized in an end-to-end way and has bi-task outputs…
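
The distinction the abstract draws between direct and indirect regression can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `decode_direct` and `decode_indirect`, the sample coordinates, and the choice to express the indirect case as per-corner offsets from an axis-aligned proposal are all assumptions made for illustration. Detectors such as Faster-RCNN and SSD typically parameterize the regression as center and size deltas relative to an anchor or proposal instead.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Quad = List[Point]  # four (x, y) vertices of a quadrilateral


def decode_direct(point: Point, offsets: List[Point]) -> Quad:
    """Direct regression: each vertex is an offset from one given point."""
    px, py = point
    return [(px + dx, py + dy) for dx, dy in offsets]


def decode_indirect(proposal: Tuple[float, float, float, float],
                    offsets: List[Point]) -> Quad:
    """Indirect regression (illustrative form): each vertex is an offset
    from the matching corner of a box proposal (x_min, y_min, x_max, y_max).
    """
    x0, y0, x1, y1 = proposal
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    return [(cx + dx, cy + dy) for (cx, cy), (dx, dy) in zip(corners, offsets)]


# Hypothetical tilted text quadrilateral used as the regression target.
target = [(10, 20), (50, 10), (54, 26), (14, 36)]

# Direct: offsets are measured from a single point inside the text region.
direct_quad = decode_direct(
    (32, 23), [(-22, -3), (18, -13), (22, 3), (-18, 13)]
)

# Indirect: offsets are measured from the corners of an axis-aligned proposal.
indirect_quad = decode_indirect(
    (10, 10, 54, 36), [(0, 10), (-4, 0), (0, -10), (4, 0)]
)
```

Both calls recover the same quadrilateral. The difference lies in what the offsets are measured against: the direct form needs only a point, while the indirect form first requires a proposal that covers the text reasonably well, which is harder to obtain for strongly tilted text.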

Taxonomy

Topics: Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques · Image Processing and 3D Reconstruction