Chinese/English mixed Character Segmentation as Semantic Segmentation
Huabin Zheng, Jingyu Wang, Zhengjie Huang, Yang Yang, Rong Pan

TL;DR
This paper presents a novel approach to multilingual character segmentation by framing it as a semantic segmentation task using fully convolutional networks, effectively handling Chinese/English mixed text with high accuracy.
Contribution
The work introduces a deep learning-based method for multilingual character segmentation, specifically addressing Chinese/English mixed cases with an FCN architecture, outperforming previous techniques.
Findings
Model generalizes well to real-world samples
Significantly outperforms previous methods
Effective for Chinese/English mixed text segmentation
Abstract
OCR character segmentation for multilingual printed documents is difficult due to the diversity of different linguistic characters. Previous approaches mainly focus on monolingual texts and are not suitable for multilingual cases. In this work, we particularly tackle the Chinese/English mixed case by reframing it as a semantic segmentation problem. We take advantage of the successful architecture called fully convolutional networks (FCN) in the field of semantic segmentation. Given a wide enough receptive field, FCN can utilize the necessary context around a horizontal position to determine whether this is a splitting point or not. As a deep neural architecture, FCN can automatically learn useful features from raw text line images. Although trained on synthesized samples with simulated random disturbance, our FCN model generalizes well to real-world samples. The experimental…
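To make the reframing concrete: if a model labels every horizontal position of a text line image as "split" or "not split", the character boundaries can be read off those per-column labels. The sketch below is illustrative only. The abstract does not describe the model's output format or any post-processing, so the function name `split_points`, the per-column probability input, the 0.5 threshold, and the rule of taking the centre of each run are all assumptions, not the authors' method.

```python
from typing import List


def split_points(column_probs: List[float], threshold: float = 0.5) -> List[int]:
    """Collapse each run of 'split' columns into a single cut position.

    column_probs[x] is a hypothetical per-column score that horizontal
    position x is a splitting point (assumed output of a segmentation model).
    """
    cuts: List[int] = []
    run_start = None  # index where the current run of split columns began
    for x, p in enumerate(column_probs):
        if p >= threshold:
            if run_start is None:
                run_start = x
        elif run_start is not None:
            # Run ended at column x - 1; keep its centre as the cut.
            cuts.append((run_start + x - 1) // 2)
            run_start = None
    if run_start is not None:
        # A run that reaches the right edge of the line image.
        cuts.append((run_start + len(column_probs) - 1) // 2)
    return cuts


# Made-up scores for an 11-column line image, for illustration only.
probs = [0.9, 0.8, 0.1, 0.1, 0.1, 0.7, 0.9, 0.8, 0.2, 0.1, 0.6]
print(split_points(probs))  # [0, 6, 10]
```

Taking the centre of each run is one simple choice among several; the point is only that a per-position labelling, as in semantic segmentation, maps directly to a set of cut positions along the text line.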
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Natural Language Processing Techniques
Methods: Max Pooling · Convolution · Fully Convolutional Network
