TL;DR
UVDoc proposes a grid-based neural method for unwarping photographed documents, together with a new pseudo-photorealistic training dataset, and reports state-of-the-art results on the DocUNet benchmark.
Contribution
The paper presents a dual-task, fully convolutional network that jointly predicts a document's 3D grid mesh and its 2D unwarping grid, and introduces UVDoc, a pseudo-photorealistic dataset for training and evaluating such models.
Findings
Achieves state-of-the-art results on the DocUNet benchmark.
Demonstrates the effectiveness of pseudo-photorealistic data for training.
Provides new evaluation metrics for document unwarping quality.
Abstract
Restoring the original, flat appearance of a printed document from casual photographs of bent and wrinkled pages is a common everyday problem. In this paper we propose a novel method for grid-based single-image document unwarping. Our method performs geometric distortion correction via a fully convolutional deep neural network that learns to predict the 3D grid mesh of the document and the corresponding 2D unwarping grid in a dual-task fashion, implicitly encoding the coupling between the shape of a 3D piece of paper and its 2D image. In order to allow unwarping models to train on data that is more realistic in appearance than the commonly used synthetic Doc3D dataset, we create and publish our own dataset, called UVDoc, which combines pseudo-photorealistic document images with physically accurate 3D shape and unwarping function annotations. Our dataset is labeled with all the…
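To make the grid-based idea concrete, here is a minimal NumPy sketch of how a predicted 2D unwarping grid could be applied to a photo. It is an illustration under stated assumptions, not the authors' implementation: the function names (`identity_grid`, `upsample_grid`, `bilinear_sample`, `unwarp`), the coordinate convention (normalized `(x, y)` in `[-1, 1]`), and the grid resolution are all hypothetical choices made for this example. The network itself is not shown; the sketch starts from a coarse grid that such a network would output.

```python
import numpy as np

def identity_grid(gh, gw):
    """Coarse grid (gh, gw, 2) of normalized (x, y) coords that maps the image to itself."""
    xs = np.linspace(-1.0, 1.0, gw)
    ys = np.linspace(-1.0, 1.0, gh)
    gx, gy = np.meshgrid(xs, ys)          # both have shape (gh, gw)
    return np.stack([gx, gy], axis=-1)

def upsample_grid(grid, out_h, out_w):
    """Bilinearly upsample a coarse (gh, gw, 2) grid to (out_h, out_w, 2)."""
    gh, gw, _ = grid.shape
    ys = np.linspace(0, gh - 1, out_h)
    xs = np.linspace(0, gw - 1, out_w)
    y0 = np.floor(ys).astype(int).clip(0, gh - 2)
    x0 = np.floor(xs).astype(int).clip(0, gw - 2)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    top = grid[y0][:, x0] * (1 - wx) + grid[y0][:, x0 + 1] * wx
    bot = grid[y0 + 1][:, x0] * (1 - wx) + grid[y0 + 1][:, x0 + 1] * wx
    return top * (1 - wy) + bot * wy

def bilinear_sample(image, coords):
    """Sample image (H, W, C) at float pixel coords (h, w, 2), stored as (x, y)."""
    H, W = image.shape[:2]
    x = coords[..., 0].clip(0, W - 1)
    y = coords[..., 1].clip(0, H - 1)
    x0 = np.floor(x).astype(int).clip(0, W - 2)
    y0 = np.floor(y).astype(int).clip(0, H - 2)
    wx = (x - x0)[..., None]
    wy = (y - y0)[..., None]
    top = image[y0, x0] * (1 - wx) + image[y0, x0 + 1] * wx
    bot = image[y0 + 1, x0] * (1 - wx) + image[y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bot * wy

def unwarp(image, grid, out_h, out_w):
    """Resample `image` so that output pixel (i, j) reads from the photo
    location that the (upsampled) grid assigns to it.

    Assumed convention: grid values are normalized (x, y) in [-1, 1],
    where -1 and 1 are the first and last pixel centers of the photo.
    """
    H, W = image.shape[:2]
    dense = upsample_grid(grid, out_h, out_w)
    px = (dense[..., 0] + 1.0) * 0.5 * (W - 1)
    py = (dense[..., 1] + 1.0) * 0.5 * (H - 1)
    coords = np.stack([px, py], axis=-1)
    return bilinear_sample(image.astype(np.float64), coords)
```

With an identity grid the photo comes back unchanged, and negating the grid's x coordinates mirrors it horizontally, which is a quick sanity check on the coordinate convention. In a deep learning framework the same resampling step is usually done with a built-in grid sampling operator rather than hand-written interpolation.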
