MxT: Mamba x Transformer for Image Inpainting

Shuang Chen; Amir Atapour-Abarghouei; Haozheng Zhang; Hubert P. H. Shum

arXiv:2407.16126 · cs.CV · August 19, 2024 · 2 cites

TL;DR

MxT introduces a hybrid model combining Mamba and transformer modules for efficient and high-quality image inpainting, effectively capturing local textures and global context.

Contribution

The paper proposes a Hybrid Module that combines Mamba and transformer architectures for image inpainting.

Findings

01. Outperforms state-of-the-art methods on CelebA-HQ and Places2 datasets.

02. Efficient long-range data interaction with linear computational costs.

03. Enhances image reconstruction quality and contextual accuracy.
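
To make the linear-cost claim in finding 02 concrete, the sketch below counts the work done by a simple state-space style recurrence versus scalar self-attention on the same sequence. It is an illustration only, not the MxT implementation: the function names `ssm_scan` and `self_attention`, and the fixed coefficients `a` and `b`, are assumptions made here. In particular, Mamba uses input-dependent (selective) parameters, which this fixed-coefficient recurrence does not model.

```python
import math

def ssm_scan(xs, a=0.9, b=0.1):
    """Fixed-coefficient recurrence h_t = a*h_{t-1} + b*x_t (hypothetical, non-selective).

    One pass over the sequence, so the step count grows linearly with length.
    """
    h, out, steps = 0.0, [], 0
    for x in xs:
        h = a * h + b * x
        out.append(h)
        steps += 1
    return out, steps

def self_attention(xs):
    """Scalar self-attention: each position scores every position (quadratic)."""
    out, steps = [], 0
    for q in xs:
        scores = [q * k for k in xs]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, xs)) / z)
        steps += len(xs)
    return out, steps

seq = [0.1 * i for i in range(64)]
_, scan_steps = ssm_scan(seq)
_, attn_steps = self_attention(seq)
print(scan_steps, attn_steps)  # 64 4096
```

Doubling the sequence length doubles `scan_steps` but quadruples `attn_steps`, which is the scaling gap that motivates pairing a state-space component with attention.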

Abstract

Image inpainting, or image completion, is a crucial task in computer vision that aims to restore missing or damaged regions of images with semantically coherent content. This technique requires a precise balance of local texture replication and global contextual understanding to ensure the restored image integrates seamlessly with its surroundings. Traditional methods using Convolutional Neural Networks (CNNs) are effective at capturing local patterns but often struggle with broader contextual relationships due to their limited receptive fields. Recent advancements have incorporated transformers, leveraging their ability to understand global interactions. However, these methods face computational inefficiencies and struggle to maintain fine-grained details. To overcome these challenges, we introduce MxT, composed of the proposed Hybrid Module (HM), which combines Mamba with the transformer…

Code & Models

Repositories

chrischen1023/mxt (PyTorch, official)

Taxonomy

Topics: Industrial Vision Systems and Defect Detection · Image Enhancement Techniques · Advanced Vision and Imaging

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces