Orientation-Independent Chinese Text Recognition in Scene Images
Haiyang Yu, Xiaocong Wang, Bin Li, Xiangyang Xue

TL;DR
This paper proposes a method for recognizing Chinese text in natural scene images that stays robust to orientation changes. It disentangles content features from orientation features, which improves recognition accuracy on both horizontal and vertical text.
Contribution
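It introduces a Character Image Reconstruction Network (CIRN) that recovers printed character images from disentangled content and orientation information. This lets the recognizer work from orientation-independent visual features and handle both horizontal and vertical Chinese text in scenes.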
Findings
Achieved a 45.63% improvement on the VCTR dataset by introducing CIRN.
Demonstrated effective disentanglement of content and orientation information.
Improved the robustness of Chinese text recognition in natural scenes.
Abstract
Scene text recognition (STR) has attracted much attention due to its broad applications. The previous works pay more attention to dealing with the recognition of Latin text images with complex backgrounds by introducing language models or other auxiliary networks. Different from Latin texts, many vertical Chinese texts exist in natural scenes, which brings difficulties to current state-of-the-art STR methods. In this paper, we take the first attempt to extract orientation-independent visual features by disentangling content and orientation information of text images, thus recognizing both horizontal and vertical texts robustly in natural scenes. Specifically, we introduce a Character Image Reconstruction Network (CIRN) to recover corresponding printed character images with disentangled content and orientation information. We conduct experiments on a scene dataset for benchmarking…
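The disentangling idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the vector sizes, the random linear decoder standing in for CIRN, and every function name below are assumptions made only for illustration. The sketch splits a per-character feature into a content part and an orientation part, feeds both parts to a reconstruction step, and passes only the content part on to recognition.

```python
import random

# All sizes below are hypothetical, chosen only to keep the example small.
CONTENT_DIM = 4    # assumed length of the content code
ORIENT_DIM = 2     # assumed length of the orientation code
GLYPH_PIXELS = 9   # a flattened 3x3 "printed character" image


def disentangle(feature):
    """Split one character feature into (content, orientation) codes."""
    assert len(feature) == CONTENT_DIM + ORIENT_DIM
    return feature[:CONTENT_DIM], feature[CONTENT_DIM:]


def make_decoder(seed=0):
    """Random linear map standing in for a reconstruction network (assumption)."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.5, 0.5) for _ in range(CONTENT_DIM + ORIENT_DIM)]
        for _ in range(GLYPH_PIXELS)
    ]


def reconstruct(decoder, content, orientation):
    """Recover a flattened glyph image from both codes."""
    code = content + orientation
    return [sum(w * x for w, x in zip(row, code)) for row in decoder]


def mse(pred, target):
    """Mean squared reconstruction error between two flattened images."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def recognition_input(feature):
    """Only the content code is handed to the recognizer."""
    content, _ = disentangle(feature)
    return content


# Same character content, different (hypothetical) orientation codes.
horizontal = [0.2, -0.1, 0.7, 0.4, 1.0, 0.0]
vertical = [0.2, -0.1, 0.7, 0.4, 0.0, 1.0]

decoder = make_decoder()
glyph_h = reconstruct(decoder, *disentangle(horizontal))
glyph_v = reconstruct(decoder, *disentangle(vertical))
```

In this toy setup the recognizer sees the same content code for both inputs, while the reconstruction step still receives the orientation code. In a trained model the reconstruction error (here `mse`) against a printed character image would be the signal that encourages this split.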
Taxonomy
Topics: Handwritten Text Recognition Techniques
