Efficient Document Image Dewarping via Hybrid Deep Learning and Cubic Polynomial Geometry Restoration
Valery Istomin, Oleg Pereziabov, Ilya Afanasyev

TL;DR
This paper introduces a hybrid dewarping method combining deep learning for document detection with classical geometry restoration techniques, achieving high accuracy and efficiency in correcting distortions for improved OCR performance.
Contribution
The study presents a novel hybrid approach that integrates deep learning and classical computer vision for document dewarping, offering superior accuracy and efficiency over existing methods.
Findings
Achieves the lowest median character error rate (CER) among the compared methods, 0.0235, indicating high OCR accuracy.
Outperforms state-of-the-art methods in geometry restoration quality.
Requires fewer computational resources than pure deep learning solutions.
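For readers unfamiliar with the metric, CER is conventionally the character-level edit distance between the OCR output and the reference text, divided by the reference length. The sketch below shows how a median CER over a set of documents could be computed. The function names are illustrative and not taken from the paper, and the paper's exact evaluation protocol (text normalization, OCR engine) is not given here, so this is only the textbook definition.

```python
from statistics import median


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance using a rolling one-row dynamic program."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def median_cer(pairs) -> float:
    """Median CER over (reference, ocr_output) pairs, one pair per document."""
    return median(cer(ref, hyp) for ref, hyp in pairs)
```

A median is less sensitive than a mean to a few badly failed pages, which may be why it is the statistic reported.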
Abstract
Camera-captured document images often suffer from geometric distortions caused by paper deformation, perspective distortion, and lens aberrations, significantly reducing OCR accuracy. This study develops an efficient automated method for document image dewarping that balances accuracy with computational efficiency. We propose a hybrid approach combining deep learning for document detection with classical computer vision for geometry restoration. YOLOv8 performs initial document segmentation and mask generation. Subsequently, classical CV techniques construct a topological 2D grid through cubic polynomial interpolation of document boundaries, followed by image remapping to correct nonlinear distortions. A new annotated dataset and open-source framework are provided to facilitate reproducibility and further research. Experimental evaluation against state-of-the-art methods (RectiNet,…
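The geometry-restoration step can be illustrated with a minimal sketch. It assumes that boundary points along the top and bottom page edges have already been extracted from the segmentation mask (not shown), fits a cubic polynomial to each edge, and blends linearly between the two curves to form a sampling grid. The linear blending, the use of only two edges, and the function names are assumptions made for illustration. The abstract does not specify how the authors construct their grid.

```python
import numpy as np


def fit_cubic(xs, ys):
    """Least-squares cubic y = a*x^3 + b*x^2 + c*x + d through edge points."""
    return np.poly1d(np.polyfit(xs, ys, deg=3))


def build_remap_grid(top_pts, bottom_pts, out_w, out_h):
    """Return (map_x, map_y), each of shape (out_h, out_w).

    Entry [r, c] is the source-image coordinate to sample for output pixel
    (r, c). top_pts and bottom_pts are (N, 2) sequences of (x, y) points,
    N >= 4, lying on the top and bottom page edges (hypothetical inputs).
    """
    top_pts = np.asarray(top_pts, dtype=np.float64)
    bottom_pts = np.asarray(bottom_pts, dtype=np.float64)
    top = fit_cubic(top_pts[:, 0], top_pts[:, 1])
    bottom = fit_cubic(bottom_pts[:, 0], bottom_pts[:, 1])

    # Sample each fitted edge at out_w evenly spaced x positions.
    u = np.linspace(0.0, 1.0, out_w)
    x_top = top_pts[:, 0].min() + u * np.ptp(top_pts[:, 0])
    x_bot = bottom_pts[:, 0].min() + u * np.ptp(bottom_pts[:, 0])
    y_top, y_bot = top(x_top), bottom(x_bot)

    # Blend linearly from the top curve (v = 0) to the bottom curve (v = 1).
    v = np.linspace(0.0, 1.0, out_h)[:, None]
    map_x = (1.0 - v) * x_top + v * x_bot
    map_y = (1.0 - v) * y_top + v * y_bot
    return map_x.astype(np.float32), map_y.astype(np.float32)
```

The two float32 maps are in the form expected by OpenCV's `cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)`, which would resample the warped photograph onto a flat `out_h` by `out_w` page.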
Taxonomy
Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques
