GDB: Gated convolutions-based Document Binarization
Zongyuan Yang, Yongping Xiong, Guibin Wu

TL;DR
This paper introduces GDB, a gated convolutions-based neural network for document binarization that enhances stroke edge extraction by selectively focusing on relevant features, outperforming existing methods across multiple datasets.
Contribution
The paper proposes a novel end-to-end gated convolutions framework with a two-stage process and multi-scale features for improved stroke edge extraction in document binarization.
Findings
Outperforms state-of-the-art methods on ten DIBCO datasets.
Achieves top ranking on six benchmark datasets.
Effectively extracts fine stroke edges with gating mechanisms.
Abstract
Document binarization is a key pre-processing step for many document analysis tasks. However, existing methods cannot extract stroke edges finely, mainly due to the fair-treatment nature of vanilla convolutions and the extraction of stroke edges without adequate supervision by boundary-related information. In this paper, we formulate text extraction as the learning of gating values and propose an end-to-end gated convolutions-based network (GDB) to solve the problem of imprecise stroke edge extraction. The gated convolutions are applied to selectively extract the features of strokes with different attention. Our proposed framework consists of two stages. First, a coarse sub-network with an extra edge branch is trained to get more precise feature maps by feeding a priori mask and edge. Second, a refinement sub-network is cascaded to refine the output of the first stage by gated…
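To make the idea of "learning gating values" concrete, here is a minimal pure-Python sketch of a single-channel gated convolution in its commonly used form: one convolution produces features, a second convolution on the same input produces gates, and the output is the activated features multiplied element-wise by the sigmoid of the gates. This is an illustration under stated assumptions, not the GDB implementation: the function names (`conv2d_valid`, `gated_conv2d`), the ReLU activation, and the single-channel, no-padding setup are all assumptions made for this example.

```python
import math


def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation with no padding (single channel)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_conv2d(image, feature_kernel, gate_kernel):
    """Gated convolution: ReLU(features) * sigmoid(gates), element-wise.

    A vanilla convolution would return only the feature response and so
    treat every pixel alike. Here a second kernel yields a soft gate in
    (0, 1) per output location that scales how much of the feature passes.
    The ReLU choice is an assumption made for this sketch.
    """
    features = conv2d_valid(image, feature_kernel)
    gates = conv2d_valid(image, gate_kernel)
    return [
        [max(f, 0.0) * sigmoid(g) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gates)
    ]
```

In a trained network both kernels are learned, so the gate can suppress background responses and pass stroke responses. A gate kernel of all zeros gives a gate of 0.5 everywhere, which halves every feature value.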
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Text and Document Classification Technologies
Methods: Multi Loss (BCE Loss + Focal Loss) + Dice Loss · Gated Convolution
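For readers unfamiliar with the loss terms named in the Methods tag, the following pure-Python sketch computes binary cross-entropy, focal loss, and Dice loss on flat lists of predicted foreground probabilities and 0/1 targets, then sums them. It uses the standard textbook definitions. The equal weighting, the `gamma`, `alpha`, and `smooth` defaults, and all function names are assumptions for illustration; the listing does not say how the paper weights or configures these terms.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean focal loss: down-weights pixels that are already well classified."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        p_t = p if t == 1 else 1.0 - p  # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """One minus the soft Dice overlap between prediction and target."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets) + smooth
    return 1.0 - (2.0 * intersection + smooth) / denominator


def combined_loss(probs, targets):
    """Unweighted sum of the three terms (the weighting is an assumption)."""
    return (
        bce_loss(probs, targets)
        + focal_loss(probs, targets)
        + dice_loss(probs, targets)
    )
```

The pixel-wise terms (BCE and focal) penalize each pixel independently, while the Dice term measures region overlap, which helps when text pixels are a small fraction of the page.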
