Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks
Rui-Yang Ju, Yu-Shian Lin, Yanlin Jin, Chih-Chia Chen, Chun-Tse Chien, Jen-Shiun Chiang

TL;DR
This paper introduces a novel three-stage GAN-based method for enhancing and binarizing degraded color document images, effectively handling various types of manuscript degradation to improve text extraction accuracy.
Contribution
The work presents a new three-stage network combining discrete wavelet transform and GANs for color document image binarization, outperforming existing methods on multiple datasets.
Findings
Achieved state-of-the-art Avg-Score metrics on multiple DIBCO datasets.
Effectively handled diverse degradation types in ancient manuscripts.
Demonstrated improved text extraction in degraded color documents.
Abstract
The efficient extraction of text information from the background in degraded color document images is an important challenge in the preservation of ancient manuscripts. The imperfect preservation of ancient manuscripts has led to different types of degradation over time, such as page yellowing, staining, and ink bleeding, which seriously affect the results of document image binarization. This work proposes an effective three-stage network method for the image enhancement and binarization of degraded documents using generative adversarial networks (GANs). Specifically, in Stage-1, we first split the input images into multiple patches, and then split these patches into four single-channel patch images (gray, red, green, and blue). Then, three single-channel patch images (red, green, and blue) are processed by the discrete wavelet transform (DWT) with normalization. In Stage-2, we use four…
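The Stage-1 preprocessing described in the abstract (patch splitting, channel splitting, DWT with normalization) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not state the patch size, the wavelet family, which sub-band is kept, or the normalization scheme, so the non-overlapping patches, the one-level Haar wavelet, the approximation (LL) sub-band, the min-max normalization, and the luminance weights for the gray channel are all assumptions. The function names are hypothetical.

```python
import numpy as np


def split_into_patches(image, patch_size):
    """Split an H x W x 3 image into non-overlapping square patches.

    Assumption: border regions that do not fill a whole patch are dropped.
    """
    height, width, _ = image.shape
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches


def to_single_channels(patch):
    """Return the four single-channel images (gray, red, green, blue) of an RGB patch."""
    patch = patch.astype(np.float64)
    red, green, blue = patch[..., 0], patch[..., 1], patch[..., 2]
    # Assumption: gray is the common ITU-R BT.601 luminance combination.
    gray = 0.299 * red + 0.587 * green + 0.114 * blue
    return {"gray": gray, "red": red, "green": green, "blue": blue}


def haar_dwt2_ll(channel):
    """One-level 2D orthonormal Haar DWT, returning only the LL sub-band.

    Assumption: the channel has even height and width.
    """
    top_left = channel[0::2, 0::2]
    top_right = channel[0::2, 1::2]
    bottom_left = channel[1::2, 0::2]
    bottom_right = channel[1::2, 1::2]
    return (top_left + top_right + bottom_left + bottom_right) / 2.0


def normalize(band):
    """Min-max normalize a sub-band to [0, 1]; a constant band maps to zeros."""
    low, high = band.min(), band.max()
    if high == low:
        return np.zeros_like(band)
    return (band - low) / (high - low)


def stage1_preprocess(image, patch_size):
    """Apply the sketch to every patch: gray is kept as-is, R/G/B get DWT + normalization."""
    results = []
    for patch in split_into_patches(image, patch_size):
        channels = to_single_channels(patch)
        processed = {"gray": channels["gray"]}
        for name in ("red", "green", "blue"):
            processed[name] = normalize(haar_dwt2_ll(channels[name]))
        results.append(processed)
    return results
```

Calling `stage1_preprocess(image, 32)` on a 64 x 96 RGB array would yield six patch dictionaries, each holding a 32 x 32 gray image and three 16 x 16 normalized sub-bands. A library such as PyWavelets could replace the hand-written Haar step if a different wavelet is wanted.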
Taxonomy
Topics: Handwritten Text Recognition Techniques · Digital Media Forensic Detection · Image Processing and 3D Reconstruction
Methods: Depthwise Convolution · Batch Normalization · Pointwise Convolution · Depthwise Separable Convolution · Sigmoid Activation · RMSProp · Squeeze-and-Excitation Block · Dropout
