TL;DR
This paper introduces a lightweight neural network for cleaning degraded document images on smartphones, utilizing perceptual loss to improve performance despite limited model capacity.
Contribution
The work presents a novel, resource-efficient CNN architecture with perceptual loss for document cleanup, significantly reducing model size and computational complexity.
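As a rough illustration of how a perceptual term can sit alongside a pixel loss, here is a minimal sketch in plain Python. It is not the paper's implementation: the fixed gradient filter `fixed_features` is a hypothetical stand-in for the frozen pre-trained CNN, and the MSE pixel term and the `weight` value are assumptions made for the example.

```python
# Sketch only: a fixed gradient filter stands in for a frozen pre-trained CNN.

def mse(a, b):
    """Mean squared error between two equally sized 2-D grids."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def fixed_features(img):
    """Hypothetical frozen feature extractor: horizontal gradient + ReLU."""
    return [[max(0.0, row[j + 1] - row[j]) for j in range(len(row) - 1)]
            for row in img]


def cleanup_loss(pred, target, weight=0.5):
    """Pixel loss plus a weighted feature-space (perceptual) term.

    The weight of 0.5 is an assumed value, not taken from the paper.
    """
    pixel = mse(pred, target)
    perceptual = mse(fixed_features(pred), fixed_features(target))
    return pixel + weight * perceptual


target = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]  # clean image with a sharp stroke
pred = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]    # blurry output with no edges
loss = cleanup_loss(pred, target)
```

The feature term penalizes the missing stroke edge in `pred` on top of the per-pixel error, which is the kind of extra supervision a small network can borrow from a larger pre-trained one.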
Findings
Models are 65-1030 times smaller than existing methods.
Models require 3-27 times fewer product-sum operations than existing methods.
The models are shown empirically to be effective on real-world benchmark datasets.
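To make the size and operation-count comparisons concrete, the sketch below shows how parameters and product-sum operations are commonly counted for a single convolutional layer. The layer shapes are hypothetical and do not come from the paper; the formula assumes a standard dense convolution with a bias term.

```python
def conv2d_cost(c_in, c_out, k, h_out, w_out):
    """Return (parameters, product-sum operations) for one k x k conv layer.

    Parameters include one bias per output channel; product-sum operations
    count one multiply-accumulate per weight per output position.
    """
    params = c_out * (c_in * k * k + 1)
    macs = c_out * c_in * k * k * h_out * w_out
    return params, macs


# Hypothetical layers: same input and output resolution, different widths.
small = conv2d_cost(c_in=3, c_out=8, k=3, h_out=64, w_out=64)
large = conv2d_cost(c_in=3, c_out=64, k=3, h_out=64, w_out=64)

param_ratio = large[0] / small[0]  # how many times fewer parameters
mac_ratio = large[1] / small[1]    # how many times fewer product-sum operations
```

Summing these per-layer counts over a whole network gives the totals that ratios like those above compare.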
Abstract
Smartphones have enabled effortless capturing and sharing of documents in digital form. The documents, however, often undergo various types of degradation due to aging, stains, or shortcomings of the capture environment such as shadows and non-uniform lighting, which reduce the comprehensibility of the document images. In this work, we consider the problem of document image cleanup in embedded applications such as smartphone apps, which usually have memory, energy, and latency limitations imposed by the device and/or by the need for a good user experience. We propose a lightweight encoder-decoder convolutional neural network architecture for removing noisy elements from document images. To compensate for the reduced generalization performance of a low-capacity network, we incorporate a perceptual loss into our loss function for knowledge transfer from a pre-trained deep CNN. In terms of the…
