CFAT: Unleashing Triangular Windows for Image Super-resolution

Abhisek Ray, Gaurav Kumar, and Maheshkumar H. Kolekar

arXiv:2403.16143 · eess.IV · March 26, 2024 · 1 citation

TL;DR

This paper introduces CFAT, a novel transformer-based model for image super-resolution that combines triangular and rectangular window techniques to better capture features and reduce boundary distortion, leading to improved performance.

Contribution

The paper proposes a new non-overlapping triangular window technique integrated with rectangular windows in a transformer model for enhanced image super-resolution.

Findings

01 Achieves 0.7 dB higher PSNR than state-of-the-art models

02 Effectively reduces boundary distortion in super-resolution tasks

03 Captures long-range, multi-scale features for better image quality
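
To put the 0.7 dB figure in context, the sketch below computes PSNR from mean squared error and converts a dB gain into the equivalent MSE ratio. It is a minimal illustration, not the paper's evaluation code: the function names `psnr` and `mse_ratio_for_gain` are hypothetical, the 8-bit peak value of 255 is an assumption, and this summary does not state the evaluation protocol (color channel, border cropping) that the authors used.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    max_value is the assumed peak pixel value (255 for 8-bit images).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """MSE of the better model divided by MSE of the baseline for a given PSNR gain."""
    return 10.0 ** (-gain_db / 10.0)
```

Under these assumptions, `mse_ratio_for_gain(0.7)` is roughly 0.85, so a 0.7 dB PSNR gain corresponds to about 15% lower mean squared error at the same peak value.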

Abstract

Transformer-based models have revolutionized the field of image super-resolution (SR) by harnessing their inherent ability to capture complex contextual features. The overlapping rectangular shifted-window technique is now common practice in transformer-based super-resolution models, where it improves the quality and robustness of image upscaling. However, it suffers from distortion at the boundaries and has limited unique shifting modes. To overcome these weaknesses, we propose a non-overlapping triangular window technique that works synchronously with the rectangular one to mitigate boundary-level distortion and allow the model to access more unique shifting modes. In this paper, we propose a Composite Fusion Attention Transformer (CFAT) that incorporates triangular-rectangular window-based local attention with a channel-based global attention technique in image…
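
The abstract does not specify how the triangular windows are constructed. As one plausible illustration, the sketch below splits a square window into four non-overlapping triangular regions along its two diagonals and builds a boolean mask that restricts attention to pixels within the same triangle. The diagonal split, the tie-breaking rule for pixels on the diagonals, and the names `triangular_window_labels` and `same_triangle_mask` are all assumptions made for this example; the official repository should be consulted for the actual CFAT partitioning.

```python
import numpy as np


def triangular_window_labels(size):
    """Label each pixel of a size x size window with one of four triangles.

    0 = top, 1 = right, 2 = bottom, 3 = left. Pixels lying on a diagonal
    are assigned by the inequalities below (an arbitrary tie-breaking
    choice), so the four regions never overlap and cover the whole window.
    """
    rows, cols = np.indices((size, size))
    on_or_above_main = cols >= rows             # main diagonal: top-left to bottom-right
    on_or_above_anti = rows + cols <= size - 1  # anti-diagonal: top-right to bottom-left

    labels = np.empty((size, size), dtype=np.int64)
    labels[on_or_above_main & on_or_above_anti] = 0    # top triangle
    labels[on_or_above_main & ~on_or_above_anti] = 1   # right triangle
    labels[~on_or_above_main & ~on_or_above_anti] = 2  # bottom triangle
    labels[~on_or_above_main & on_or_above_anti] = 3   # left triangle
    return labels


def same_triangle_mask(labels):
    """Boolean (N, N) mask over flattened pixels; True where two pixels share a triangle."""
    flat = labels.reshape(-1)
    return flat[:, None] == flat[None, :]
```

A mask like this could be passed to a masked self-attention layer so that each token attends only to tokens in its own triangle, which is one way to realize window-local attention with non-rectangular windows.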

Code & Models

Repositories

rayabhisek123/cfat (PyTorch, official)

Taxonomy

Topics: Advanced Image Processing Techniques · Advanced Vision and Imaging · Image and Signal Denoising Methods

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Multi-Head Attention · Softmax · Dropout