Efficient Document Image Dewarping via Hybrid Deep Learning and Cubic Polynomial Geometry Restoration

Valery Istomin; Oleg Pereziabov; Ilya Afanasyev

arXiv:2501.03145 · cs.CV · November 20, 2025

TL;DR

This paper introduces a hybrid dewarping method combining deep learning for document detection with classical geometry restoration techniques, achieving high accuracy and efficiency in correcting distortions for improved OCR performance.

Contribution

The study presents a hybrid approach to document dewarping that pairs deep learning for document detection with classical computer vision for geometry restoration, and reports better accuracy and efficiency than existing methods.

Findings

01

Achieves the lowest median character error rate (CER), 0.0235, indicating high OCR accuracy.

02

Outperforms state-of-the-art methods in geometry restoration quality.

03

Requires fewer computational resources than pure deep learning solutions.
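
For readers unfamiliar with the metric in the first finding, the sketch below shows the conventional definition of CER: the character-level edit distance between the OCR output and the reference text, divided by the reference length, with the median then taken over documents. The function names (`levenshtein`, `cer`, `median_cer`) are illustrative, and the OCR engine and exact normalization used in the paper are not stated in this summary, so this is the textbook definition rather than the authors' evaluation code.

```python
from statistics import median


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate, normalized by reference length (an assumption)."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


def median_cer(pairs):
    """Median CER over (reference, OCR output) pairs."""
    return median(cer(r, h) for r, h in pairs)
```

Under this definition, a median CER of 0.0235 would correspond to roughly two to three character errors per hundred reference characters on the median document.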

Abstract

Camera-captured document images often suffer from geometric distortions caused by paper deformation, perspective distortion, and lens aberrations, significantly reducing OCR accuracy. This study develops an efficient automated method for document image dewarping that balances accuracy with computational efficiency. We propose a hybrid approach combining deep learning for document detection with classical computer vision for geometry restoration. YOLOv8 performs initial document segmentation and mask generation. Subsequently, classical CV techniques construct a topological 2D grid through cubic polynomial interpolation of document boundaries, followed by image remapping to correct nonlinear distortions. A new annotated dataset and open-source framework are provided to facilitate reproducibility and further research. Experimental evaluation against state-of-the-art methods (RectiNet,…
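
The geometry-restoration step described in the abstract can be illustrated with a minimal sketch, written under several assumptions that are not taken from the paper. It assumes the top and bottom document boundaries have already been extracted as point arrays from a segmentation mask (that step is not shown), fits a cubic polynomial to each, builds a sampling grid by blending linearly between the two curves, and resamples the image. The function names are hypothetical, the linear blend is a simplification, and nearest-neighbor indexing in NumPy stands in for whatever interpolating remap the authors actually use.

```python
import numpy as np


def fit_cubic(xs, ys):
    """Least-squares cubic y = a*x^3 + b*x^2 + c*x + d through boundary points."""
    return np.poly1d(np.polyfit(xs, ys, deg=3))


def build_grid(top_pts, bottom_pts, out_w, out_h):
    """Return (map_x, map_y): source coordinates for each output pixel.

    top_pts and bottom_pts are (N, 2) arrays of (x, y) boundary samples.
    Rows of the grid blend linearly from the top curve to the bottom curve
    (an assumed simplification; side boundaries are ignored here).
    """
    top = fit_cubic(top_pts[:, 0], top_pts[:, 1])
    bottom = fit_cubic(bottom_pts[:, 0], bottom_pts[:, 1])
    # Use only the x-range covered by both boundaries.
    x_min = max(top_pts[:, 0].min(), bottom_pts[:, 0].min())
    x_max = min(top_pts[:, 0].max(), bottom_pts[:, 0].max())
    xs = np.linspace(x_min, x_max, out_w)
    t = np.linspace(0.0, 1.0, out_h)[:, None]  # 0 at top row, 1 at bottom row
    map_x = np.tile(xs, (out_h, 1))
    map_y = (1.0 - t) * top(xs)[None, :] + t * bottom(xs)[None, :]
    return map_x, map_y


def remap_nearest(image, map_x, map_y):
    """Sample image at the grid coordinates using nearest-neighbor lookup."""
    h, w = image.shape[:2]
    xi = np.clip(np.rint(map_x).astype(int), 0, w - 1)
    yi = np.clip(np.rint(map_y).astype(int), 0, h - 1)
    return image[yi, xi]
```

In practice the two maps could be passed to an interpolating resampler instead of `remap_nearest` to avoid aliasing; the nearest-neighbor version is used here only to keep the example dependency-light.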

Code & Models

Repositories

horizonparadox/drccbi (Official)

Taxonomy

Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques