RectiNet-v2: A stacked network architecture for document image dewarping
Hmrishav Bandyopadhyay, Tanmoy Dasgupta, Nibaran Das, Mita Nasipuri

TL;DR
RectiNet-v2 is an end-to-end CNN that takes a warped document image as input and produces a distortion-free one, removing perspective distortions and folds so that document recognition algorithms can process the result. The model is trained on synthetically warped document images.
Contribution
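The paper presents an end-to-end CNN for document dewarping with three architectural changes: a bifurcated decoder with shared weights that keeps grid coordinates from intermingling, residual networks in the U-Net skip connections, and a gated network that focuses the model on structure and line-level detail. The model is trained on synthetically warped document images.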
Findings
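Achieves results comparable to the state of the art on the DocUNet dataset
Removes perspective distortions and folds from warped document images
Uses residual networks in the U-Net skip connections to pass data from different receptive fields, and a gated network to focus on structure and line-level detail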
Abstract
With the advent of mobile and hand-held cameras, document images have found their way into almost every domain. Dewarping these images to remove perspective distortions and folds is essential so that they can be understood by document recognition algorithms. For this, we propose an end-to-end CNN architecture that produces distortion-free document images from the warped documents it takes as input. We train this model on synthetically simulated warped document images to compensate for the lack of sufficient natural data. Our method is novel in the use of a bifurcated decoder with shared weights to prevent intermingling of grid coordinates, in the use of residual networks in the U-Net skip connections to allow flow of data from different receptive fields in the model, and in the use of a gated network to help the model focus on structure and line-level detail of the document image. We…
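The abstract mentions grid coordinates but the visible text does not say how they are turned into an image. The following is a minimal sketch of one common way to do that, under stated assumptions: the network is assumed to output two coordinate maps, one for x and one for y, that give for every output pixel the position to read from in the warped input (a backward mapping). The names `bilinear_sample`, `dewarp`, `grid_x`, and `grid_y` are hypothetical and are not taken from the paper, and the sketch works on a grayscale image stored as nested lists rather than on tensors.

```python
from typing import List

Grid = List[List[float]]


def bilinear_sample(image: Grid, x: float, y: float) -> float:
    """Read a grayscale image at a fractional (x, y) position.

    Coordinates outside the image are clamped to the nearest border pixel.
    """
    height, width = len(image), len(image[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)

    # Integer corners around the sampling position.
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)

    # Fractional offsets act as interpolation weights.
    wx, wy = x - x0, y - y0
    top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
    bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
    return (1 - wy) * top + wy * bottom


def dewarp(warped: Grid, grid_x: Grid, grid_y: Grid) -> Grid:
    """Backward-map a warped image using two coordinate maps.

    Assumption (not from the paper): grid_x[r][c] and grid_y[r][c] hold the
    source position in `warped` for output pixel (r, c).
    """
    return [
        [bilinear_sample(warped, gx, gy) for gx, gy in zip(row_x, row_y)]
        for row_x, row_y in zip(grid_x, grid_y)
    ]


if __name__ == "__main__":
    warped = [[0, 10, 20], [30, 40, 50]]
    # An identity grid reads every pixel from its own position.
    grid_x = [[0, 1, 2], [0, 1, 2]]
    grid_y = [[0, 0, 0], [1, 1, 1]]
    print(dewarp(warped, grid_x, grid_y))
```

Keeping the x and y maps as separate arguments mirrors the idea of two decoder branches that each handle one coordinate, although the exact output format of RectiNet-v2 is not given in the text above.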
Taxonomy
Topics: Digital Media Forensic Detection · Handwritten Text Recognition Techniques · Advanced Image Processing Techniques
Methods: Concatenated Skip Connection · Max Pooling · Convolution · U-Net
