Recognition of Handwritten Chinese Text by Segmentation: A Segment-annotation-free Approach

Dezhi Peng; Lianwen Jin; Weihong Ma; Canyu Xie; Hesuo Zhang; Shenggao Zhu; Jing Li

arXiv:2207.14801·cs.CV·August 1, 2022

TL;DR

This paper introduces a segmentation-based Chinese handwritten text recognition method that uses weak supervision and contextual regularization, outperforming existing segmentation-free methods in accuracy and speed.

Contribution

It presents a novel segmentation-based approach with weakly supervised training and contextual regularization, challenging the dominance of segmentation-free methods in HCTR.

Findings

01. Significantly outperforms existing methods on multiple benchmarks.
02. Achieves higher inference speed than CTC and attention-based approaches.
03. Effectively integrates contextual information during training.

Abstract

Online and offline handwritten Chinese text recognition (HCTR) has been studied for decades. Early methods adopted oversegmentation-based strategies but suffered from low speed, insufficient accuracy, and the high cost of character segmentation annotations. Recently, segmentation-free methods based on connectionist temporal classification (CTC) and attention mechanisms have dominated the field of HCTR. However, people actually read text character by character, especially for ideograms such as Chinese. This raises the question: are segmentation-free strategies really the best solution to HCTR? To explore this issue, we propose a new segmentation-based method for recognizing handwritten Chinese text that is implemented using a simple yet efficient fully convolutional network. A novel weakly supervised learning method is proposed to enable the network to be trained using only transcript…
