Channel-Partitioned Windowed Attention And Frequency Learning for Single Image Super-Resolution
Dinh Phu Tran, Dao Duy Hung, Daeyoung Kim

TL;DR
This paper introduces a novel transformer-based approach for single image super-resolution that captures long-range dependencies and integrates spatial and frequency domain information, leading to improved performance over existing methods.
Contribution
The paper proposes the Channel-Partitioned Attention Transformer and Spatial-Frequency Interaction Module to enhance long-range dependency modeling and frequency content learning in SISR.
Findings
CPAT surpasses state-of-the-art methods by up to 0.31 dB at ×2 SR on Urban100.
The proposed modules effectively improve super-resolution quality.
Experimental results validate the effectiveness of the architecture.
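To put the reported 0.31 dB figure in perspective, the sketch below computes PSNR, the usual metric behind dB gains in super-resolution, and converts a PSNR gain into the corresponding reduction in mean squared error. It assumes pixel values in [0, 1] and flat lists of pixels. The function names `psnr` and `mse_ratio_for_gain` are illustrative, and the paper's exact evaluation protocol (colour channel, border cropping) is not stated in this summary.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE is multiplied when PSNR improves by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # A uniform error of 0.1 on a [0, 1] scale gives an MSE of 0.01.
    print(round(psnr([0.0] * 4, [0.1] * 4), 2))
    # MSE factor implied by a 0.31 dB PSNR gain.
    print(round(mse_ratio_for_gain(0.31), 3))
```

Under these assumptions, a 0.31 dB gain corresponds to an MSE roughly 7% lower than the baseline's.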
Abstract
Recently, window-based attention methods have shown great potential for computer vision tasks, particularly in Single Image Super-Resolution (SISR). However, they may fall short in capturing long-range dependencies and relationships between distant tokens. Additionally, we find that learning in the spatial domain does not convey the frequency content of the image, which is a crucial aspect in SISR. To tackle these issues, we propose a new Channel-Partitioned Attention Transformer (CPAT) to better capture long-range dependencies by sequentially expanding windows along the height and width of feature maps. In addition, we propose a novel Spatial-Frequency Interaction Module (SFIM), which incorporates information from the spatial and frequency domains to provide more comprehensive information from feature maps. This includes information about the frequency content and enhances the receptive field…
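The abstract's point that frequency-domain processing complements spatial processing and widens the receptive field can be illustrated with a small NumPy sketch. This is not the authors' SFIM. The functions `spatial_branch`, `frequency_branch`, and `spatial_frequency_mix` are hypothetical stand-ins: a 3×3 box filter plays the role of a local convolution, and a per-coefficient scaling of the 2-D Fourier spectrum plays the role of a learned frequency-domain operation.

```python
import numpy as np


def spatial_branch(x):
    """3x3 box filter with edge padding: a stand-in for a local convolution."""
    padded = np.pad(x, 1, mode="edge")
    h, w = x.shape
    # Average the nine shifted copies of the padded map.
    return sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0


def frequency_branch(x, weight):
    """Scale each 2-D Fourier coefficient of x by weight, then transform back."""
    spectrum = np.fft.rfft2(x)  # shape (H, W // 2 + 1)
    return np.fft.irfft2(spectrum * weight, s=x.shape)


def spatial_frequency_mix(x, weight):
    """Sum the local (spatial) and global (frequency) responses."""
    return spatial_branch(x) + frequency_branch(x, weight)


if __name__ == "__main__":
    x = np.zeros((8, 8))
    x[0, 0] = 1.0  # a single impulse in one corner

    # Keep only the zero-frequency (DC) coefficient.
    weight = np.zeros((8, 5))
    weight[0, 0] = 1.0

    local = spatial_branch(x)
    glob = frequency_branch(x, weight)
    print(np.count_nonzero(local))             # pixels touched by the 3x3 filter
    print(np.count_nonzero(np.round(glob, 12)))  # pixels touched by the frequency op
```

In this toy setting the box filter only affects pixels next to the impulse, while a single pointwise operation in the frequency domain changes every pixel of the map. That is the sense in which a frequency branch has a global receptive field.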
Taxonomy
Methods: Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Attention Is All You Need · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention · Dense Connections
