CLII: Visual-Text Inpainting via Cross-Modal Predictive Interaction

Liang Zhao, Qing Guo, Xiaoguang Li, and Song Wang

arXiv:2407.16204·cs.CV·July 24, 2024

TL;DR

This paper introduces CLII, a cross-modal inpainting model that leverages visual and textual information to restore damaged scene text images and complete missing text, outperforming existing methods.

Contribution

The paper proposes a novel cross-modal predictive interaction model for scene text inpainting and text completion, integrating visual and textual cues for improved restoration.

Findings

01. Outperforms baseline methods significantly in experiments.

02. Effectively restores damaged scene text images across various scenarios.

03. Enhances robustness of scene text spotting with missing pixels.

Abstract

Image inpainting aims to fill missing pixels in damaged images and has achieved significant progress with cutting-edge learning techniques. Nevertheless, state-of-the-art inpainting methods are mainly designed for natural images and cannot correctly recover text within scene text images, and training existing models on scene text images does not fix the issue. In this work, we identify the visual-text inpainting task to achieve high-quality scene text image restoration and text completion: given a scene text image with unknown missing regions and the corresponding text with unknown missing characters, we aim to complete the missing information in both the image and the text by leveraging their complementary information. Intuitively, the input text, even if damaged, contains language priors about the contents of the image and can guide the image inpainting. Meanwhile, the scene text image…

Taxonomy

Methods: Inpainting