TPFNet: A Novel Text In-painting Transformer for Text Removal

Onkar Susladkar; Dhruv Makwana; Gayatri Deshmukh; Sparsh Mittal; Sai Chandra Teja R; Rekha Singhal

arXiv:2210.14461·cs.CV·October 28, 2022

TL;DR

TPFNet is an innovative end-to-end transformer-based network that effectively removes text from images by combining feature synthesis, segmentation, and high-pass filtering, outperforming previous methods on multiple datasets.

Contribution

Introduces TPFNet, a novel one-stage transformer-based architecture with a multi-headed decoder and segmentation guidance for precise text removal from images.

Findings

01

Outperforms recent methods on Oxford, SCUT, and SCUT-EnsText datasets.

02

Achieves higher PSNR and lower text-detection precision on the output images, indicating better image quality and more complete text removal.

03

Utilizes a pyramidal vision transformer and adversarial loss conditioned on segmentation for improved results.
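
For reference, the PSNR metric cited in the findings can be computed as in the minimal sketch below. It assumes non-empty 8-bit grayscale images stored as nested lists of pixel values. The function name `psnr` and its parameters are illustrative and are not taken from the TPFNet repository.

```python
import math


def psnr(reference, output, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Assumption: images are non-empty nested lists (rows of pixel values).
    """
    if len(reference) != len(output) or any(
        len(ref_row) != len(out_row) for ref_row, out_row in zip(reference, output)
    ):
        raise ValueError("images must have the same shape")

    squared_error = 0.0
    count = 0
    for ref_row, out_row in zip(reference, output):
        for ref_px, out_px in zip(ref_row, out_row):
            squared_error += (ref_px - out_px) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the text-free output is closer to the ground-truth image, which is why higher PSNR is read as better restoration quality.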

Abstract

Text erasure from an image is helpful for various tasks such as image editing and privacy preservation. In this paper, we present TPFNet, a novel one-stage (end-to-end) network for text removal from images. Our network has two parts: feature synthesis and image generation. Since noise can be more effectively removed from low-resolution images, part 1 operates on low-resolution images. The output of part 1 is a low-resolution text-free image. Part 2 uses the features learned in part 1 to predict a high-resolution text-free image. In part 1, we use a "pyramidal vision transformer" (PVT) as the encoder. Further, we use a novel multi-headed decoder that generates a high-pass filtered image and a segmentation map, in addition to a text-free image. The segmentation branch helps locate the text precisely, and the high-pass branch helps in learning the image structure. To precisely locate the…
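
To illustrate what a high-pass filtered image looks like as a target, the sketch below subtracts a 3x3 box blur from a grayscale image so that only edges and fine structure remain. This is an assumption made for illustration: the abstract does not state which high-pass filter TPFNet uses, and the names `box_blur` and `high_pass` are hypothetical.

```python
def box_blur(image):
    """3x3 mean filter; border pixels are handled by clamping indices (edge replication)."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += image[yy][xx]
            row.append(total / 9.0)
        blurred.append(row)
    return blurred


def high_pass(image):
    """High-pass response: original minus its low-pass (blurred) version.

    Assumption: a box blur stands in for whatever low-pass filter the paper uses.
    """
    blurred = box_blur(image)
    return [
        [pixel - low for pixel, low in zip(row, blurred_row)]
        for row, blurred_row in zip(image, blurred)
    ]


# Usage: a vertical step edge produces a response only next to the edge.
step = [[0, 0, 10, 10] for _ in range(3)]
response = high_pass(step)
```

Flat regions map to zero while the columns on either side of the step get negative and positive responses, which is the structural information the high-pass branch is described as learning.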

Code & Models

Repositories

candlelabai/tpfnet (PyTorch, official)

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Handwritten Text Recognition Techniques