What Shape Is Optimal for Masks in Text Removal?
Hyakka Nakada, Marika Kubota

TL;DR
This paper investigates the optimal mask shapes for text removal in images, introducing a Bayesian optimization method to learn flexible, character-wise masks, and providing practical guidelines for manual masking in complex scenarios.
Contribution
The paper proposes a Bayesian optimization approach that models and learns flexible mask profiles to improve text removal in complex images.
Findings
Character-wise masks are optimal for text removal.
Minimum cover masks are not always best.
Flexible mask profiles improve performance.
Abstract
The advent of generative models has dramatically improved the accuracy of image inpainting. In particular, reconstructing the original image by removing specific text from a document image is extremely important for industrial applications. However, most existing text-removal methods focus on deleting simple scene text that appears in images captured by a camera outdoors. Little research addresses complex, practical images with dense text. We therefore created benchmark data for text removal from images containing a large amount of text. From this data, we found that text-removal performance is sensitive to perturbations of the mask profile. Thus, for practical text-removal tasks, precise tuning of the mask shape is essential. This study developed a method to model highly flexible mask profiles and learn their parameters using Bayesian optimization.…
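To make the idea of a parameterized mask concrete, here is a minimal Python sketch. It is not the authors' method: the single `margin` parameter, the box coordinates, and the helper names (`render_mask`, `mask_area`, `tune_margin`) are all hypothetical, and a plain exhaustive search stands in for Bayesian optimization. The scoring function is a placeholder for whatever inpainting-quality metric would be used in practice.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical box format: (x0, y0, x1, y1) with half-open pixel ranges.
Box = Tuple[int, int, int, int]
Mask = List[List[int]]


def render_mask(height: int, width: int, boxes: Iterable[Box], margin: int = 0) -> Mask:
    """Rasterize boxes into a binary mask, growing each box by `margin` pixels."""
    mask = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0 - margin), min(height, y1 + margin)):
            for x in range(max(0, x0 - margin), min(width, x1 + margin)):
                mask[y][x] = 1
    return mask


def mask_area(mask: Mask) -> int:
    """Count masked pixels."""
    return sum(sum(row) for row in mask)


def tune_margin(
    score_fn: Callable[[Mask], float],
    height: int,
    width: int,
    boxes: Sequence[Box],
    margins: Sequence[int],
) -> int:
    """Return the margin whose mask scores highest.

    Exhaustive search over one parameter; a stand-in for Bayesian
    optimization, which would matter once the profile has many parameters
    and each evaluation requires running an inpainting model.
    """
    return max(margins, key=lambda m: score_fn(render_mask(height, width, boxes, m)))


# Two character-wise boxes versus one word-level box covering both (toy values).
char_boxes = [(2, 2, 5, 7), (7, 2, 10, 7)]
word_box = [(2, 2, 10, 7)]

char_area = mask_area(render_mask(10, 12, char_boxes))
word_area = mask_area(render_mask(10, 12, word_box))


def toy_score(mask: Mask) -> float:
    # Arbitrary placeholder objective, unrelated to the paper's results.
    return -abs(mask_area(mask) - 70)


best_margin = tune_margin(toy_score, 10, 12, char_boxes, [0, 1, 2])
```

In a real pipeline, `score_fn` would run the inpainting model with the candidate mask and compare the output against a ground-truth image, so each evaluation is expensive. That cost is the usual motivation for a sample-efficient optimizer over a brute-force search.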
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Handwritten Text Recognition Techniques · Computer Graphics and Visualization Techniques
