DeQA-Doc: Adapting DeQA-Score to Document Image Quality Assessment

Junjie Gao; Runze Liu; Yingzhe Peng; Shujian Yang; Jin Zhang; Kai Yang; Zhiyuan You

arXiv:2507.12796·cs.CV·July 18, 2025

TL;DR

DeQA-Doc adapts a state-of-the-art multi-modal large language model-based image quality scorer to assess document image quality, achieving superior accuracy and robustness across various degradation types.

Contribution

This work introduces DeQA-Doc, a novel framework that extends DeQA-Score to document images using soft label strategies and ensemble methods for improved quality assessment.

Findings

01. DeQA-Doc outperforms existing baselines in accuracy.

02. It generalizes well across diverse degradation types.

03. It supports high-resolution document images.

Abstract

Document quality assessment is critical for a wide range of applications including document digitization, OCR, and archival. However, existing approaches often struggle to provide accurate and robust quality scores, limiting their applicability in practical scenarios. With the rapid progress in Multi-modal Large Language Models (MLLMs), recent MLLM-based methods have achieved remarkable performance in image quality assessment. In this work, we extend this success to the document domain by adapting DeQA-Score, a state-of-the-art MLLM-based image quality scorer, for document quality assessment. We propose DeQA-Doc, a framework that leverages the visual language capabilities of MLLMs and a soft label strategy to regress continuous document quality scores. To adapt DeQA-Score to DeQA-Doc, we adopt two complementary solutions to construct soft labels without the variance information. Also,…
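The abstract mentions building soft labels from a single quality score when no variance is available, but the truncated text does not say which two solutions are used. The sketch below is therefore only an illustration of two common ways to do this, not the authors' method: linear interpolation between the two adjacent quality levels, and a Gaussian with an assumed pseudo-variance. The five level centers, the `sigma` default, and all function names are hypothetical.

```python
import math

# Hypothetical: five discrete quality levels with centers 1..5 (an assumption).
LEVEL_CENTERS = [1.0, 2.0, 3.0, 4.0, 5.0]


def soft_label_interp(score, centers=LEVEL_CENTERS):
    """Split probability mass between the two levels adjacent to `score`
    so that the expected value of the label equals the (clipped) score."""
    score = min(max(score, centers[0]), centers[-1])
    probs = [0.0] * len(centers)
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= score <= hi:
            w_hi = (score - lo) / (hi - lo)
            probs[i] = 1.0 - w_hi
            probs[i + 1] = w_hi
            break
    return probs


def soft_label_gaussian(score, sigma=0.5, centers=LEVEL_CENTERS):
    """Evaluate a Gaussian centered at `score` with an assumed pseudo
    standard deviation `sigma` at each level center, then normalize."""
    weights = [math.exp(-0.5 * ((c - score) / sigma) ** 2) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probs, centers=LEVEL_CENTERS):
    """Recover a continuous score as the expectation over level centers."""
    return sum(p * c for p, c in zip(probs, centers))


if __name__ == "__main__":
    print(soft_label_interp(3.4))
    print(soft_label_gaussian(3.4))
    print(expected_score(soft_label_interp(3.4)))
```

The interpolation variant reproduces the original score exactly under `expected_score`, while the Gaussian variant only approximates it after discretization and depends on the chosen `sigma`.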

Taxonomy

Topics: Handwritten Text Recognition Techniques · Data Quality and Management · Semantic Web and Ontologies