LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network

Yuchen Su; Zhineng Chen; Zhiwen Shao; Yuning Du; Zhilong Ji; Jinfeng Bai; Yong Zhou; Yu-Gang Jiang

arXiv:2306.15142·cs.CV·January 25, 2024

TL;DR

LRANet introduces a text shape representation based on low-rank approximation and a dual assignment scheme to improve both the accuracy and the speed of scene text detection, especially for arbitrary-shaped text.

Contribution

The paper proposes a novel low-rank approximation method for text shape representation and a dual assignment scheme to improve detection accuracy and inference speed.

Findings

1. Achieves superior accuracy on challenging benchmarks.
2. Demonstrates significant speed improvements over existing methods.
3. Robust in modeling arbitrary-shaped texts.

Abstract

Recently, regression-based methods, which predict parameterized text shapes for text localization, have gained popularity in scene text detection. However, the existing parameterized text shape methods still have limitations in modeling arbitrary-shaped texts due to ignoring the utilization of text-specific shape information. Moreover, the time consumption of the entire pipeline has been largely overlooked, leading to a suboptimal overall inference speed. To address these issues, we first propose a novel parameterized text shape method based on low-rank approximation. Unlike other shape representation methods that employ data-irrelevant parameterization, our approach utilizes singular value decomposition and reconstructs the text shape using a few eigenvectors learned from labeled text contours. By exploring the shape correlation among different text contours, our method achieves…
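The low-rank idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`make_ellipse`, `learn_basis`, `encode`, `decode`), the 14-point contours, the rank of 4, and the synthetic ellipse data are all assumptions chosen for illustration. The "eigenvectors" are taken here to be the leading left singular vectors of a matrix whose columns are flattened contours (these are the eigenvectors of that matrix times its transpose).

```python
import numpy as np

def make_ellipse(cx, cy, a, b, n_points=14):
    """Toy stand-in for a labeled text contour: an (n_points, 2) ellipse."""
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    return np.stack([cx + a * np.cos(t), cy + b * np.sin(t)], axis=1)

def learn_basis(contours, rank):
    """Learn an orthonormal shape basis from contours of shape (L, N, 2)."""
    num = contours.shape[0]
    # Each column of A is one flattened contour (x0, y0, x1, y1, ...).
    A = contours.reshape(num, -1).T              # shape (2N, L)
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :rank]                           # keep the leading vectors

def encode(contour, basis):
    """Project one (N, 2) contour onto the basis: `rank` coefficients."""
    return basis.T @ contour.reshape(-1)

def decode(coeffs, basis):
    """Reconstruct an (N, 2) contour from its coefficients."""
    return (basis @ coeffs).reshape(-1, 2)

# Synthetic training set (assumption): 50 random axis-aligned ellipses.
rng = np.random.default_rng(0)
train = np.stack([
    make_ellipse(*rng.uniform(10, 100, size=2), *rng.uniform(5, 40, size=2))
    for _ in range(50)
])

basis = learn_basis(train, rank=4)

# An unseen ellipse is described by 4 numbers instead of 28 coordinates.
test_contour = make_ellipse(30.0, 60.0, 25.0, 8.0)
coeffs = encode(test_contour, basis)
recon = decode(coeffs, basis)
max_err = float(np.abs(recon - test_contour).max())
print(f"coefficients: {coeffs.shape[0]}, max reconstruction error: {max_err:.2e}")
```

In this toy family every contour is a linear combination of four fixed vectors (two offsets, one cosine term, one sine term), so four basis vectors reconstruct any such ellipse almost exactly. Real text contours are not exactly low-rank, so a regression head predicting the coefficients would trade rank against reconstruction error.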

Taxonomy

Topics: Handwritten Text Recognition Techniques · Text and Document Classification Technologies · Image Retrieval and Classification Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings