Model-based Iterative Restoration for Binary Document Image Compression with Dictionary Learning
Yandong Guo, Cheng Lu, Jan P. Allebach, and Charles A. Bouman

TL;DR
This paper introduces a Bayesian framework with dictionary learning for restoring noisy binary document images, which enhances both image quality and compression efficiency, outperforming existing methods on synthetic and real noisy images.
Contribution
It proposes a novel cost function for joint image restoration and dictionary learning, improving binary document image quality and compression ratio within the JBIG2 standard.
Findings
Reduces flipped pixels by 48.2% on synthetic noisy images
Improves compression ratio by 36.36% on synthetic noise
Outperforms the state-of-the-art method by 28.27% in compression ratio on real noisy images
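The flipped-pixel figure above can be made concrete with a small sketch. It assumes that a "flipped pixel" is a position where a binary image disagrees with a clean reference, and that the reported percentage is the relative drop in that count from the noisy image to the restored one. The function names and the toy 3x4 images are hypothetical illustrations, not taken from the paper.

```python
def count_flipped_pixels(reference, candidate):
    """Count positions where two equal-size binary images differ."""
    if len(reference) != len(candidate):
        raise ValueError("images must have the same number of rows")
    flipped = 0
    for ref_row, cand_row in zip(reference, candidate):
        if len(ref_row) != len(cand_row):
            raise ValueError("rows must have the same length")
        flipped += sum(r != c for r, c in zip(ref_row, cand_row))
    return flipped


def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("baseline count must be nonzero")
    return 100.0 * (before - after) / before


# Toy data (assumed): a clean vertical stroke, a noisy scan, a restoration.
clean = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
noisy = [[1, 1, 0, 0], [0, 0, 1, 0], [0, 1, 1, 1]]
restored = [[0, 1, 1, 0], [0, 0, 1, 0], [0, 1, 1, 1]]

noisy_errors = count_flipped_pixels(clean, noisy)
restored_errors = count_flipped_pixels(clean, restored)
reduction = percent_reduction(noisy_errors, restored_errors)
```

A clean reference is only available when the noise is synthetic, which is consistent with the flipped-pixel result being reported for the synthetic-noise test images only.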
Abstract
The inherent noise in an observed (e.g., scanned) binary document image degrades the image quality and harms the compression ratio by breaking pattern repetition and adding entropy to the document image. In this paper, we design a cost function in a Bayesian framework with dictionary learning. Minimizing our cost function produces a restored image of better quality than the observed noisy image, and a dictionary for representing and encoding the image. After the restoration, we use this dictionary (from the same cost function) to encode the restored image following the symbol-dictionary framework of the JBIG2 standard in lossless mode. Experimental results with a variety of document images demonstrate that our method improves the image quality compared with the observed image, and simultaneously improves the compression ratio. For the test images with synthetic…
Taxonomy
Topics: Image and Signal Denoising Methods · Advanced Image Processing Techniques · Advanced Data Compression Techniques
