Cross Aggregation Transformer for Image Restoration

Zheng Chen; Yulun Zhang; Jinjin Gu; Yongbing Zhang; Linghe Kong; Xin Yuan

arXiv:2211.13654·cs.CV·March 24, 2023·122 cites

TL;DR

This paper introduces the Cross Aggregation Transformer (CAT), a novel image restoration model that combines rectangle-window self-attention with local-global feature coupling to improve long-range dependency modeling and performance.

Contribution

The paper proposes the Cross Aggregation Transformer with Rectangle-Window Self-Attention and Axial-Shift, integrating CNN inductive biases for enhanced image restoration.

Findings

01. Outperforms state-of-the-art methods on multiple image restoration tasks

02. Effectively models long-range dependencies with rectangle-window attention

03. Enhances local-global feature integration through the Locality Complementary Module

Abstract

Recently, the Transformer architecture has been introduced into image restoration to replace the convolutional neural network (CNN), with surprising results. Because a Transformer with global attention has high computational complexity, some methods use a local square window to limit the scope of self-attention. However, these methods lack direct interaction among different windows, which limits the establishment of long-range dependencies. To address this issue, we propose a new image restoration model, Cross Aggregation Transformer (CAT). The core of CAT is the Rectangle-Window Self-Attention (Rwin-SA), which applies horizontal and vertical rectangle-window attention in different heads in parallel to expand the attention area and aggregate features across different windows. We also introduce the Axial-Shift operation for interaction between different windows. Furthermore, we propose…
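The rectangle-window idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the window sizes (2 and 4), the use of a channel split as a stand-in for separate attention heads, and the omission of learned query/key/value projections and of the Axial-Shift step are all assumptions made to keep the example short.

```python
import numpy as np

def rect_window_partition(x, win_h, win_w):
    """Split an (H, W, C) map into non-overlapping win_h x win_w windows.

    Returns an array of shape (num_windows, win_h * win_w, C).
    """
    H, W, C = x.shape
    assert H % win_h == 0 and W % win_w == 0, "map must tile evenly"
    x = x.reshape(H // win_h, win_h, W // win_w, win_w, C)
    x = x.transpose(0, 2, 1, 3, 4)  # (rows, cols, win_h, win_w, C)
    return x.reshape(-1, win_h * win_w, C)

def window_merge(windows, H, W, win_h, win_w):
    """Inverse of rect_window_partition: rebuild the (H, W, C) map."""
    C = windows.shape[-1]
    x = windows.reshape(H // win_h, W // win_w, win_h, win_w, C)
    x = x.transpose(0, 2, 1, 3, 4)  # (rows, win_h, cols, win_w, C)
    return x.reshape(H, W, C)

def window_self_attention(tokens):
    """Softmax self-attention inside each window (no learned weights)."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ tokens

def rwin_attention_sketch(x, short=2, long=4):
    """Hypothetical two-branch layout: half the channels attend inside
    horizontal (short x long) windows, the other half inside vertical
    (long x short) windows, and the results are concatenated."""
    H, W, C = x.shape
    half = C // 2
    horiz = window_self_attention(rect_window_partition(x[..., :half], short, long))
    vert = window_self_attention(rect_window_partition(x[..., half:], long, short))
    return np.concatenate(
        [window_merge(horiz, H, W, short, long),
         window_merge(vert, H, W, long, short)],
        axis=-1,
    )
```

With these assumed sizes, each window still holds 8 tokens, the same count as a square window of roughly 3 x 3, but the two branches together cover a cross-shaped region that reaches 4 pixels along each axis, which is the sense in which rectangle windows "expand the attention area".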

Code & Models

Videos

Cross Aggregation Transformer for Image Restoration · SlidesLive

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Advanced Image Fusion Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Adam · Softmax · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Label Smoothing · Convolution