Channel-Partitioned Windowed Attention And Frequency Learning for Single Image Super-Resolution

Dinh Phu Tran; Dao Duy Hung; Daeyoung Kim

arXiv:2407.16232 · cs.CV · August 28, 2024

TL;DR

This paper introduces a novel transformer-based approach for single image super-resolution that captures long-range dependencies and integrates spatial and frequency domain information, leading to improved performance over existing methods.

Contribution

The paper proposes the Channel-Partitioned Attention Transformer (CPAT) and the Spatial-Frequency Interaction Module (SFIM) to improve long-range dependency modeling and frequency content learning in single image super-resolution (SISR).

Findings

1. CPAT surpasses state-of-the-art methods by up to 0.31 dB at ×2 SR on Urban100.

2. The proposed modules effectively improve super-resolution quality.

3. Experimental results validate the effectiveness of the architecture.
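
To put the 0.31 dB figure in context: the page does not name the metric, but dB gains in SISR are conventionally reported as peak signal-to-noise ratio (PSNR), so the sketch below assumes that. The function name `psnr`, the flat pixel lists, and the 8-bit peak value of 255 are illustrative assumptions, not details taken from the paper.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel values.

    Assumes 8-bit intensities by default (peak=255); this is a generic
    definition, not the paper's exact evaluation protocol.
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)


# A 0.31 dB PSNR gain corresponds to an MSE ratio of roughly 0.93.
ratio = mse_ratio_for_gain(0.31)
```

Because PSNR is logarithmic in mean squared error, a gain of a few tenths of a dB already reflects a reduction in MSE of several percent.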

Abstract

Recently, window-based attention methods have shown great potential for computer vision tasks, particularly in Single Image Super-Resolution (SISR). However, they may fall short in capturing long-range dependencies and relationships between distant tokens. Additionally, we find that learning in the spatial domain does not convey the frequency content of the image, which is a crucial aspect of SISR. To tackle these issues, we propose a new Channel-Partitioned Attention Transformer (CPAT) to better capture long-range dependencies by sequentially expanding windows along the height and width of feature maps. In addition, we propose a novel Spatial-Frequency Interaction Module (SFIM), which incorporates information from the spatial and frequency domains to provide more comprehensive information from feature maps. This includes information about the frequency content and enhances the receptive field…
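
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the even channel split, the fixed `stripe` width, and the function names `partition_windows`, `channel_partitioned_windows`, and `frequency_features` are all assumptions made for illustration. The sketch gives one half of the channels windows that span the full height and the other half windows that span the full width, and it uses a per-channel 2-D FFT amplitude as a stand-in for a frequency-domain branch.

```python
import numpy as np


def partition_windows(x, win_h, win_w):
    """Split a (C, H, W) feature map into non-overlapping windows.

    Returns shape (num_windows, win_h * win_w, C): one token sequence
    per window, the layout attention within a window would consume.
    """
    c, h, w = x.shape
    assert h % win_h == 0 and w % win_w == 0, "windows must tile the map"
    x = x.reshape(c, h // win_h, win_h, w // win_w, win_w)
    x = x.transpose(1, 3, 2, 4, 0)  # (h_blocks, w_blocks, win_h, win_w, C)
    return x.reshape(-1, win_h * win_w, c)


def channel_partitioned_windows(x, stripe):
    """Hypothetical channel partition: half the channels get tall windows
    (full height, `stripe` wide), the other half get wide windows
    (`stripe` high, full width)."""
    c, h, w = x.shape
    first, second = x[: c // 2], x[c // 2:]
    tall = partition_windows(first, h, stripe)
    wide = partition_windows(second, stripe, w)
    return tall, wide


def frequency_features(x):
    """Per-channel 2-D FFT amplitude of a (C, H, W) map; an assumed
    stand-in for the frequency branch, not the SFIM design itself."""
    return np.abs(np.fft.rfft2(x, axes=(-2, -1)))


x = np.arange(4 * 8 * 8, dtype=np.float64).reshape(4, 8, 8)
tall, wide = channel_partitioned_windows(x, stripe=2)
freq = frequency_features(x)
```

With a 4-channel 8×8 input and `stripe=2`, each half yields four windows of 16 tokens, so every token can attend across the entire height or width of the map rather than only within a small square window.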

Taxonomy

Methods: Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Attention Is All You Need · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention · Dense Connections