Learning Surrogate Models of Document Image Quality Metrics for Automated Document Image Processing
Prashant Singh, Ekta Vats, Anders Hast

TL;DR
This paper introduces surrogate models that predict document image quality metrics without ground truth, enabling on-the-fly quality assessment of unseen documents, for example during hyperparameter tuning of image processing algorithms.
Contribution
It proposes learning surrogate models of document quality metrics from existing datasets that include ground truth images, so that the metrics can be predicted for unseen documents where no ground truth is available.
Findings
Surrogate models accurately predict quality metrics on unseen documents.
The approach enables on-the-fly quality assessment in image processing.
Empirical evaluation shows effectiveness on DIBCO and H-DIBCO datasets.
Abstract
Computation of document image quality metrics often depends upon the availability of a ground truth image corresponding to the document. This limits the applicability of quality metrics in applications such as hyperparameter optimization of image processing algorithms that operate on-the-fly on unseen documents. This work proposes the use of surrogate models to learn the behavior of a given document quality metric on existing datasets where ground truth images are available. The trained surrogate model can later be used to predict the metric value on previously unseen document images without requiring access to ground truth images. The surrogate model is empirically evaluated on the Document Image Binarization Competition (DIBCO) and the Handwritten Document Image Binarization Competition (H-DIBCO) datasets.
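The workflow described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic images produced by `make_document` stand in for a dataset such as DIBCO, pixel-level F-measure stands in for the quality metric, the hand-picked `features` are hypothetical, and a closed-form ridge regression stands in for whatever surrogate model the authors use. The point is only the shape of the idea: the metric needs ground truth at training time, while the trained surrogate does not need it at prediction time.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_document(rng, size=64):
    """Synthetic stand-in (assumption) for a dataset image with ground truth."""
    truth = rng.random((size, size)) < 0.15          # True marks an ink pixel
    ink, paper = rng.uniform(0.1, 0.4), rng.uniform(0.6, 0.9)
    gray = np.where(truth, ink, paper) + rng.normal(0, 0.08, (size, size))
    return np.clip(gray, 0, 1), truth

def f_measure(binarized, truth):
    """Pixel-level F-measure; computing it requires the ground truth mask."""
    tp = np.sum(binarized & truth)
    if tp == 0:
        return 0.0
    precision = tp / np.sum(binarized)
    recall = tp / np.sum(truth)
    return float(2 * precision * recall / (precision + recall))

def features(gray, threshold):
    """Hypothetical features that need only the image and the hyperparameter."""
    fg = (gray < threshold).mean()                   # fraction of pixels labelled ink
    return np.array([1.0, gray.mean(), gray.std(),
                     threshold, threshold ** 2, fg, fg ** 2])

def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression, used here as a simple surrogate model."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# Training: ground truth is available, so the true metric can be computed.
thresholds = np.linspace(0.2, 0.8, 13)
X, y = [], []
for _ in range(40):
    gray, truth = make_document(rng)
    for t in thresholds:
        X.append(features(gray, t))
        y.append(f_measure(gray < t, truth))
weights = fit_ridge(np.array(X), np.array(y))

# Prediction: the unseen image has no ground truth; only the surrogate is used.
unseen_gray, _ = make_document(rng)
predicted = np.array([features(unseen_gray, t) @ weights for t in thresholds])
best_threshold = float(thresholds[np.argmax(predicted)])
print(f"threshold with highest predicted F-measure: {best_threshold:.2f}")
```

In this sketch the binarization threshold plays the role of the hyperparameter being tuned: the surrogate scores each candidate threshold on the unseen image, and the highest-scoring one is selected without ever consulting a ground truth image.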
