Orientation-Independent Chinese Text Recognition in Scene Images

Haiyang Yu; Xiaocong Wang; Bin Li; Xiangyang Xue

arXiv:2309.01081 · cs.CV · September 6, 2023

TL;DR

This paper proposes a novel method for recognizing Chinese text in natural scene images that is robust to orientation variations by disentangling content and orientation features, significantly improving recognition accuracy.

Contribution

It introduces a Character Image Reconstruction Network (CIRN) that extracts orientation-independent features, enabling robust recognition of both horizontal and vertical Chinese texts in scenes.

Findings

01 Achieved a 45.63% improvement on the VCTR dataset with CIRN.

02 Demonstrated effective disentangling of content and orientation information.

03 Improved the robustness of Chinese text recognition in natural scenes.

Abstract

Scene text recognition (STR) has attracted much attention due to its broad applications. Previous works focus mainly on recognizing Latin text images with complex backgrounds by introducing language models or other auxiliary networks. Unlike Latin text, Chinese text often appears vertically in natural scenes, which poses difficulties for current state-of-the-art STR methods. In this paper, we make the first attempt to extract orientation-independent visual features by disentangling the content and orientation information of text images, thus recognizing both horizontal and vertical texts robustly in natural scenes. Specifically, we introduce a Character Image Reconstruction Network (CIRN) to recover the corresponding printed character images from the disentangled content and orientation information. We conduct experiments on a scene dataset for benchmarking…
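The disentangling idea in the abstract can be illustrated with a toy sketch. This is not the authors' CIRN: the fixed split point, the hand-written feature vectors, and the names `Disentangled`, `split_features`, and `swap_orientation` are all hypothetical. In a real model the split would be learned, for example under a reconstruction objective, rather than fixed by index.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Disentangled:
    content: List[float]      # which character it is (assumed orientation-free)
    orientation: List[float]  # how the text line is laid out


def split_features(features: List[float], content_dim: int) -> Disentangled:
    """Toy disentangler: the first `content_dim` values are treated as
    content, the remainder as orientation. Hypothetical; a real network
    would learn this separation."""
    if not 0 < content_dim < len(features):
        raise ValueError("content_dim must leave room for both parts")
    return Disentangled(features[:content_dim], features[content_dim:])


def swap_orientation(a: Disentangled, b: Disentangled) -> Tuple[Disentangled, Disentangled]:
    """Exchange orientation codes while keeping each sample's content."""
    return (Disentangled(a.content, b.orientation),
            Disentangled(b.content, a.orientation))


# Hand-written features for the same character seen in two orientations.
horizontal = split_features([0.9, 0.1, 1.0, 0.0], content_dim=2)
vertical = split_features([0.9, 0.1, 0.0, 1.0], content_dim=2)

# A recognizer that reads only `content` gets the same input either way.
same_content = horizontal.content == vertical.content

# Swapping orientation codes leaves the content codes untouched.
h_as_v, v_as_h = swap_orientation(horizontal, vertical)
```

Here the recognizer would consume only the `content` part, which is what makes the prediction independent of whether the text is horizontal or vertical.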

Code & Models

Repositories

fudanvi/fudanocr (PyTorch, official)

Taxonomy

Topics: Handwritten Text Recognition Techniques