Document Image Binarization in JPEG Compressed Domain using Dual Discriminator Generative Adversarial Networks
Bulla Rajesh, Manav Kamlesh Agrawal, Milan Bhuva, Kisalaya, Kishore, Mohammed Javed

TL;DR
This paper introduces a novel method for document image binarization directly in the JPEG compressed domain using Dual Discriminator GANs, achieving state-of-the-art results without full image decompression.
Contribution
The paper proposes a dual discriminator GAN framework that binarizes JPEG compressed images directly, reducing computational costs and improving robustness over existing pixel-based methods.
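To make "JPEG compressed domain" concrete: baseline JPEG stores each 8x8 block of an image as quantized DCT coefficients, so a method that avoids full decompression works with data of this kind rather than with pixels. The sketch below computes such coefficients for a single block. It only illustrates the representation and is not the authors' pipeline; the function names and the single uniform quantization step `q` are assumptions made for brevity (real JPEG uses an 8x8 quantization table).

```python
import math

N = 8  # JPEG block size


def dct2_block(block):
    """2-D DCT-II of an 8x8 block using the JPEG normalization.

    `block` is assumed to be already level-shifted (pixel value minus 128).
    """
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


def quantize(coeffs, q=16):
    """Hypothetical uniform quantizer: divide every coefficient by q and round."""
    return [[round(value / q) for value in row] for row in coeffs]


# A flat block has all of its energy in the DC coefficient at position (0, 0).
flat_block = [[10.0] * N for _ in range(N)]
coeffs = dct2_block(flat_block)
quantized = quantize(coeffs)
```

For a flat block every AC coefficient is zero, which is why smooth document backgrounds compress well and why the coefficient layout itself carries structural information.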
Findings
Achieved state-of-the-art performance on DIBCO datasets.
Demonstrated robustness to noisy and degraded document images.
Reduced processing time and memory usage compared to traditional methods.
Abstract
Image binarization techniques are widely used to enhance noisy and/or degraded images for Document Image Analysis (DIA) applications such as word spotting, document retrieval, and OCR. Most existing techniques feed pixel images into Convolutional Neural Networks to binarize documents, which may not be effective for compressed images that need to be processed without full decompression. This paper therefore proposes binarizing document images directly from their JPEG compressed stream by employing Dual Discriminator Generative Adversarial Networks (DD-GANs). The two discriminator networks, Global and Local, work on different image ratios, and focal loss is used as the generator loss. The proposed model has been thoroughly tested with different versions of DIBCO…
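The abstract names focal loss as the generator loss. As a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), the function below shows how well-classified pixels are down-weighted relative to hard ones. The function name and the defaults `gamma=2.0` and `alpha=0.25` are assumptions taken from common usage, not values reported on this page, and the paper's exact formulation may differ.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over predicted probabilities.

    probs:   predicted probability of the positive class for each pixel
    targets: ground-truth labels, each 0 or 1
    gamma:   focusing parameter (assumed default); 0 removes the focusing term
    alpha:   weight of the positive class (assumed default)
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confident correct prediction contributes far less than an uncertain one.
easy = binary_focal_loss([0.95], [1])
hard = binary_focal_loss([0.30], [1])
```

With `gamma=0` the focusing term disappears and the expression reduces to class-weighted binary cross-entropy, which makes it easy to compare the two losses on the same predictions.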
Taxonomy
Topics: Digital Media Forensic Detection · Handwritten Text Recognition Techniques · Vehicle License Plate Recognition
Methods: Focal Loss · Convolution
