HRCenterNet: An Anchorless Approach to Chinese Character Segmentation in Historical Documents

Chia-Wei Tang; Chao-Lin Liu; Po-Sen Chiu

arXiv:2012.05739·cs.CV·April 6, 2021

TL;DR

HRCenterNet is an anchorless model for accurate and efficient segmentation of Chinese characters in historical documents; it reaches an IoU of 0.81 on the MTHv2 dataset.

Contribution

The paper introduces HRCenterNet, an anchorless, parallelized architecture for Chinese character segmentation, evaluates it on the MTHv2 dataset, and reports state-of-the-art results.

Findings

01

Achieves IoU 0.81 on the MTHv2 dataset

02

Outperforms existing methods in speed-accuracy trade-off

03

Demonstrates effectiveness on over 3000 historical document images
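
The IoU (intersection over union) figure in the findings measures how well a predicted character box overlaps its ground-truth box. The sketch below shows the standard definition for two axis-aligned boxes. The function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration; they are not taken from the paper or its repository, whose exact evaluation protocol may differ.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an
    assumption for the example, not the paper's data format.
    """
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes contribute no intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    # Guard against degenerate (zero-area) inputs.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 2, 2)
    ground_truth = (1, 1, 3, 3)
    print(f"IoU = {box_iou(predicted, ground_truth):.3f}")
```

An IoU of 1.0 means the two boxes coincide exactly, and 0.0 means they do not overlap at all.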

Abstract

Historical documents have always been indispensable in the transmission of human civilization, yet the books that carry this information are susceptible to damage from various factors. Thanks to recent technology, the automatic digitization of these documents is one of the quickest and most effective means of preservation. Automatic text digitization can be divided into two main stages, character segmentation and character recognition, where the recognition results depend largely on the accuracy of segmentation. In this study, we therefore focus only on the character segmentation of historical Chinese documents. We propose a model named HRCenterNet, which combines an anchorless object detection method with a parallelized architecture. The MTHv2 dataset consists of over 3000 Chinese historical document images and over 1…
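
To give a rough sense of what "anchorless" detection can look like, the sketch below picks object centers as local maxima of a per-pixel score map, in the general style of center-point detectors. It is a generic illustration under stated assumptions, not the HRCenterNet implementation: the function name `find_centers`, the 3x3 neighbourhood, and the 0.5 threshold are all hypothetical choices, and the truncated abstract does not describe the model's actual decoding step.

```python
def find_centers(heatmap, threshold=0.5):
    """Return (row, col, score) for local maxima of a 2D score map.

    `heatmap` is a list of equal-length rows of floats. A cell counts
    as a center if its score is at least `threshold` and no cell in its
    3x3 neighbourhood scores higher. Ties on a flat plateau are all
    kept. Both the threshold and the neighbourhood size are
    illustrative assumptions.
    """
    rows = len(heatmap)
    cols = len(heatmap[0]) if rows else 0
    centers = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Compare against every in-bounds neighbour of (r, c).
            is_peak = all(
                score >= heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            )
            if is_peak:
                centers.append((r, c, score))
    return centers


if __name__ == "__main__":
    toy_heatmap = [
        [0.1, 0.2, 0.1, 0.0],
        [0.2, 0.9, 0.2, 0.1],
        [0.1, 0.2, 0.3, 0.7],
    ]
    for row, col, score in find_centers(toy_heatmap):
        print(f"center at ({row}, {col}) with score {score}")
```

Because each detection comes from a peak in the score map rather than from a fixed set of predefined box shapes, no anchor boxes need to be tuned; in center-point detectors the box size is typically predicted separately at each detected center.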

Code & Models

Repositories

Tverous/HRCenterNet
PyTorch · Official
