Scene Text Recognition from Two-Dimensional Perspective

Minghui Liao; Jian Zhang; Zhaoyi Wan; Fengming Xie; Jiajun Liang; Pengyuan Lyu; Cong Yao; Xiang Bai

arXiv:1809.06508·cs.CV·November 20, 2018·36 cites

Open Access

TL;DR

This paper introduces CA-FCN (Character Attention Fully Convolutional Network), a fully convolutional network that treats scene text recognition as a two-dimensional semantic segmentation problem with a character attention mechanism. It handles arbitrarily shaped text and outperforms previous methods on regular and irregular text datasets.

Contribution

The paper proposes a novel 2D perspective for scene text recognition using CA-FCN, which improves accuracy and robustness over traditional sequence-based methods.

Findings

01 Outperforms previous methods on regular and irregular text datasets.

02 More robust to imprecise localizations in text detection.

03 Effective recognition of arbitrary-shaped text.

Abstract

Inspired by speech recognition, recent state-of-the-art algorithms mostly consider scene text recognition as a sequence prediction problem. Though achieving excellent performance, these methods usually neglect an important fact: text in images is actually distributed in two-dimensional space. This is quite different in nature from speech, which is essentially a one-dimensional signal. In principle, directly compressing features of text into a one-dimensional form may lose useful information and introduce extra noise. In this paper, we approach scene text recognition from a two-dimensional perspective. A simple yet effective model, called Character Attention Fully Convolutional Network (CA-FCN), is devised for recognizing text of arbitrary shapes. Scene text recognition is realized with a semantic segmentation network, where an attention mechanism for characters is adopted.…
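
To make the two-dimensional view concrete, here is a minimal sketch of how a per-pixel character map could be turned into a string. It is not the authors' procedure, and the excerpt above does not say how CA-FCN forms words. The function name `decode_word`, the 26-letter alphabet, the 4-connectivity grouping, the majority vote, and the left-to-right ordering are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical alphabet: class id 1 -> 'a', ..., 26 -> 'z'; class id 0 is background.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def decode_word(label_map, background=0, alphabet=ALPHABET):
    """Turn a 2D map of per-pixel class ids into a string (illustrative only)."""
    h, w = len(label_map), len(label_map[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if label_map[y][x] == background or seen[y][x]:
                continue
            # Flood-fill one connected foreground region (4-connectivity).
            stack = [(y, x)]
            seen[y][x] = True
            pixels = []
            while stack:
                cy, cx = stack.pop()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and not seen[ny][nx] and label_map[ny][nx] != background:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Majority vote over the region's pixels picks its character class.
            votes = Counter(label_map[py][px] for py, px in pixels)
            char_id = votes.most_common(1)[0][0]
            mean_x = sum(px for _, px in pixels) / len(pixels)
            regions.append((mean_x, char_id))
    # Assumed reading order: left to right by region centre.
    regions.sort()
    return "".join(alphabet[char_id - 1] for _, char_id in regions)


# Toy map: three character regions placed at different heights.
demo_map = [
    [0, 3, 3, 0, 0, 0, 0, 0, 0],
    [0, 3, 3, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 2, 0, 20, 20],
    [0, 0, 0, 0, 0, 0, 0, 20, 20],
]
print(decode_word(demo_map))  # prints "cat"
```

In the toy map the characters sit at different heights, and one pixel in the middle region is mislabelled. The majority vote absorbs that noise, and the regions are never squeezed into a single row before classification. Sorting by horizontal centre is a simplification that would not hold for vertical or strongly curved text.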

Taxonomy

Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Image Retrieval and Classification Techniques