Spatial-Frequency Gated Swin Transformer for Remote Sensing Single-Image Super-Resolution

Md Aminur Hossain; Parekh Valkesh; Ayush V. Patel; Yogesh Jethani; Sanjay K. Singh; Biplab Banerjee

arXiv:2605.09687·cs.CV·May 12, 2026

TL;DR

This paper introduces SFG-SwinSR, a novel remote sensing super-resolution model that enhances detail reconstruction by integrating spatial-frequency gated mechanisms into the Swin Transformer architecture.

Contribution

SFG-SwinSR replaces the standard feed-forward network in each transformer block with a lightweight Spatial-Frequency Gated Feed-Forward Network (SFG-FFN), with the aim of improving high-frequency detail recovery in remote sensing image super-resolution.

Findings

01

Achieves 45.19 dB PSNR on the SpaceNet dataset.

02

Attains 0.9852 SSIM, indicating high structural similarity.

03

Outperforms previous models in detail preservation.
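
For readers unfamiliar with the PSNR figure quoted in the findings, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr`, the flat-list input format, and the default `max_value` of 255 are illustrative assumptions; this summary does not state the data range or evaluation protocol the authors used.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized flat pixel sequences.

    max_value is the largest possible pixel value (255 for 8-bit data);
    it is an assumption here, not a detail taken from the paper.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the reconstruction.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, each 10 dB gain corresponds to a tenfold reduction in mean squared error.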

Abstract

Remote Sensing (RS) single-image super-resolution aims to reconstruct high-resolution imagery from low-resolution observations while preserving fine spatial structures. Recent Swin Transformer-based models, including Swin2SR, provide strong spatial context modeling through shifted-window self-attention, but their feed-forward networks remain generic channel-mixing modules and do not separate low-frequency structural content from high-frequency residual detail. To address this limitation, we propose SFG-SwinSR, a Spatial-Frequency Gated Swin Transformer for single-image super-resolution in remote sensing. SFG-SwinSR modifies the original Swin2SR attention block by replacing each transformer block's standard feed-forward network with a lightweight Spatial-Frequency Gated Feed-Forward Network (SFG-FFN). The module estimates low-frequency content via a depthwise-blur branch, extracts…
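
The abstract is truncated here, so the exact SFG-FFN design is not available on this page. As a rough illustration of the idea it does describe (a depthwise blur estimating low-frequency content, with the remainder treated as high-frequency residual), here is a minimal NumPy sketch. The box filter, the 3x3 kernel size, the function names, and in particular the sigmoid gate in `gated_recombine` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def depthwise_box_blur(x, k=3):
    """Blur each channel independently with a k x k box filter.

    x has shape (C, H, W). Edge padding keeps the output the same size.
    A box filter is an assumed stand-in for whatever blur the paper uses.
    """
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    h, w = x.shape[1:]
    out = np.zeros_like(x, dtype=np.float64)
    # Sum the k*k shifted views; no mixing across channels (depthwise).
    for dy in range(k):
        for dx in range(k):
            out += xp[:, dy:dy + h, dx:dx + w]
    return out / (k * k)


def split_frequencies(x, k=3):
    """Return (low, high) where low is the blurred signal and high = x - low."""
    low = depthwise_box_blur(x, k)
    return low, x - low


def gated_recombine(x, gate_logits, k=3):
    """Hypothetical gate: re-inject high-frequency residual scaled by a sigmoid.

    gate_logits must broadcast against x, e.g. shape (C, 1, 1).
    """
    low, high = split_frequencies(x, k)
    gate = 1.0 / (1.0 + np.exp(-gate_logits))
    return low + gate * high
```

With this decomposition the two branches always sum back to the input, so a gate near 1 passes the features through unchanged and a gate near 0 keeps only the smoothed structure.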
