# Learning to Clean: A GAN Perspective

**Authors:** Monika Sharma, Abhishek Verma, Lovekesh Vig

arXiv: 1901.11382 · 2019-02-01

## TL;DR

This paper investigates GANs, in particular CycleGAN, for cleaning noisy scanned documents ahead of text recognition. It targets the common case where only unpaired sets of noisy and clean images are available for training, and reports that CycleGAN learns a more robust noisy-to-clean mapping than a conditional GAN trained on paired data.

## Contribution

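The paper applies CycleGAN to document denoising with unpaired data, compares it with a conditional GAN trained on paired data from the same dataset, and evaluates both on a public document dataset with several artificially induced noise types (background noise, blur, fading, watermarks).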

## Key findings

- CycleGAN effectively learns to denoise documents with unpaired data.
- CycleGAN trained on unpaired images learns a more robust noisy-to-clean mapping than a conditional GAN trained on paired data from the same dataset.
- Unpaired training enables practical denoising without paired datasets.

## Abstract

In the big data era, the impetus to digitize the vast reservoirs of data trapped in unstructured scanned documents such as invoices, bank documents and courier receipts has gained fresh momentum. The scanning process often introduces artifacts such as background noise, blur due to camera motion, watermarks, coffee stains, or faded text. These artifacts pose many readability challenges to current text recognition algorithms and significantly degrade their performance. Existing learning-based denoising techniques require a dataset comprising noisy documents paired with cleaned versions. In such scenarios, a model can be trained to generate clean documents from noisy versions. However, very often in the real world such a paired dataset is not available, and all we have for training our denoising model are unpaired sets of noisy and clean images. This paper explores the use of GANs to generate denoised versions of the noisy documents. In particular, where paired information is available, we formulate the problem as an image-to-image translation task, i.e., translating a document from the noisy domain (background noise, blurred, faded, watermarked) to a target clean document using Generative Adversarial Networks (GANs). In the absence of paired images for training, we employ CycleGAN, which is known to learn a mapping from the distribution of noisy images to the distribution of denoised images using unpaired data, to achieve image-to-image translation for cleaning the noisy documents. We compare the performance of CycleGAN for document cleaning tasks using unpaired images with a Conditional GAN trained on paired data from the same dataset. Experiments were performed on a public document dataset on which different types of noise were artificially induced; the results demonstrate that CycleGAN learns a more robust mapping from the space of noisy to clean documents.
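To make the unpaired setting concrete, here is a minimal sketch of the cycle-consistency term that CycleGAN in general uses to tie two generators together without paired examples. It is not the authors' implementation: this summary does not give the paper's loss weights, architectures, or training details. The names `cycle_consistency_loss`, `toy_clean_up`, and `toy_add_noise` are hypothetical, and the "generators" are simple arithmetic stand-ins rather than trained networks.

```python
from typing import Callable, List

Image = List[float]  # flattened grayscale image, pixel values in [0, 1]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    noisy_batch: List[Image],
    clean_batch: List[Image],
    clean_up: Callable[[Image], Image],   # G: noisy -> clean (hypothetical stand-in)
    add_noise: Callable[[Image], Image],  # F: clean -> noisy (hypothetical stand-in)
) -> float:
    """L1 cycle loss: F(G(x)) should recover x, and G(F(y)) should recover y.

    The two batches are unpaired: they may differ in size, and no noisy
    image needs a matching clean counterpart.
    """
    forward = sum(
        l1_distance(add_noise(clean_up(x)), x) for x in noisy_batch
    ) / len(noisy_batch)
    backward = sum(
        l1_distance(clean_up(add_noise(y)), y) for y in clean_batch
    ) / len(clean_batch)
    return forward + backward


# Toy stand-ins for the two generators (assumptions for illustration only).
def toy_clean_up(img: Image) -> Image:
    return [max(0.0, p - 0.25) for p in img]


def toy_add_noise(img: Image) -> Image:
    return [min(1.0, p + 0.25) for p in img]


noisy_images = [[0.5, 0.75], [0.25, 1.0]]  # two noisy samples
clean_images = [[0.25, 0.5]]               # one unrelated clean sample
loss = cycle_consistency_loss(noisy_images, clean_images, toy_clean_up, toy_add_noise)
```

In a full CycleGAN this term is added to adversarial losses for each domain. A conditional GAN trained on paired data can instead compare its output directly with the known clean target, which is why it needs the paired dataset in the first place.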

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.11382/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1901.11382/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1901.11382/full.md

---
Source: https://tomesphere.com/paper/1901.11382