HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention

Shuang Chen; Amir Atapour-Abarghouei; Hubert P. H. Shum

arXiv:2402.14185 · cs.CV · February 23, 2024 · 2 cites

TL;DR

HINT is a transformer-based image inpainting method that combines mask-aware encoding with enhanced attention. It is designed to handle large missing regions and outperforms state-of-the-art models on CelebA, CelebA-HQ, Places2, and Dunhuang.

Contribution

The paper proposes an inpainting transformer built on two components: a mask-aware pixel-shuffle downsampling module (MPD) and a spatially-activated channel attention layer. Together they improve long-range modeling and preserve the visible information in the corrupted image.
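To make the idea behind mask-aware pixel-shuffle downsampling concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the helper names `pixel_unshuffle`, `pixel_shuffle`, and `mask_aware_downsample` are hypothetical, the mask convention (1 = visible, 0 = missing) is an assumption, and any learned layers that the real module applies afterwards are omitted. The sketch only shows that rearranging pixels into channels halves the spatial size without discarding any value, and that the mask can be rearranged the same way so it stays aligned with the image.

```python
def pixel_unshuffle(plane, r):
    """Rearrange one H x W plane into r*r planes of size (H/r) x (W/r)."""
    h, w = len(plane), len(plane[0])
    if h % r or w % r:
        raise ValueError("height and width must be divisible by r")
    # Output plane (dy * r + dx) collects the pixel at offset (dy, dx)
    # inside every r x r block, so no value is averaged or dropped.
    return [
        [[plane[y * r + dy][x * r + dx] for x in range(w // r)]
         for y in range(h // r)]
        for dy in range(r) for dx in range(r)
    ]


def pixel_shuffle(planes, r):
    """Inverse of pixel_unshuffle: rebuild the original H x W plane."""
    hs, ws = len(planes[0]), len(planes[0][0])
    out = [[0] * (ws * r) for _ in range(hs * r)]
    for c, sub in enumerate(planes):
        dy, dx = divmod(c, r)
        for y in range(hs):
            for x in range(ws):
                out[y * r + dy][x * r + dx] = sub[y][x]
    return out


def mask_aware_downsample(image, mask, r=2):
    """Hypothetical helper: unshuffle image and mask identically, then stack."""
    return pixel_unshuffle(image, r) + pixel_unshuffle(mask, r)


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
mask = [[1, 1, 0, 0],   # assumed convention: 1 = visible, 0 = missing
        [1, 1, 0, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]

features = mask_aware_downsample(image, mask)  # 8 planes of size 2 x 2
```

Because `pixel_shuffle(features[:4], 2)` returns the original image exactly, the rearrangement is lossless, which is the property that distinguishes it from strided-convolution or pooling-based downsampling.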

Findings

01. Outperforms state-of-the-art models on the CelebA, CelebA-HQ, Places2, and Dunhuang datasets.

02. Preserves the visible information in the input even when the missing regions are large.

03. Demonstrates superior inpainting quality and detail recovery.

Abstract

Existing image inpainting methods leverage convolution-based downsampling approaches to reduce spatial dimensions. This may result in information loss from corrupted images where the available information is inherently sparse, especially for the scenario of large missing regions. Recent advances in self-attention mechanisms within transformers have led to significant improvements in many computer vision tasks including inpainting. However, limited by the computational costs, existing methods cannot fully exploit the efficacy of long-range modelling capabilities of such models. In this paper, we propose an end-to-end High-quality INpainting Transformer, abbreviated as HINT, which consists of a novel mask-aware pixel-shuffle downsampling module (MPD) to preserve the visible information extracted from the corrupted image while maintaining the integrity of the information available for…
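The computational-cost argument can be illustrated with a small calculation. The sketch below compares the size of the attention score matrix when attention is taken over spatial tokens against attention taken over channels. It assumes a generic channel-wise self-attention with a C x C score matrix; the listing does not give the layer's exact design, and the function names and feature-map sizes are illustrative only.

```python
def spatial_attention_entries(height, width):
    """Entries in a token-to-token score matrix: (H*W) x (H*W)."""
    tokens = height * width
    return tokens * tokens


def channel_attention_entries(channels):
    """Entries in a channel-to-channel score matrix: C x C."""
    return channels * channels


# Illustrative sizes, not taken from the paper.
h, w, c = 64, 64, 48
spatial = spatial_attention_entries(h, w)
channel = channel_attention_entries(c)
print(f"spatial: {spatial:,} entries, channel: {channel:,} entries")
```

Under these assumed sizes the spatial score matrix grows quadratically with the number of pixels, while the channel score matrix does not depend on resolution at all.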

Code & Models

Repositories

chrischen1023/hint (PyTorch, official)

Taxonomy

Topics: Industrial Vision Systems and Defect Detection · Image Enhancement Techniques · Advancements in Photolithography Techniques

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Adam · Softmax · Multi-Head Attention · Layer Normalization · Residual Connection · Absolute Position Encodings