Learning to Balance: Decoupled Siamese Diffusion Transformer for Reference-Based Remote Sensing Image Super-Resolution
Bin Luo, Runmin Dong, Zhaoyang Luo, Jinxiao Zhang, Jiyao Zhao, Fan Wei, and Haohuan Fu

TL;DR
This paper introduces DS-DiT, a decoupled Siamese diffusion transformer for reference-based remote sensing image super-resolution that balances the use of reference information against detail recovery.
Contribution
The paper proposes a decoupled attention mechanism and a patch-level weighting module to improve super-resolution quality in remote sensing images.
Findings
DS-DiT is reported to outperform existing methods on quantitative metrics.
The approach enhances visual fidelity of super-resolved images.
The method effectively balances reference reliance and detail recovery.
Abstract
Diffusion-based methods demonstrate significant potential for remote sensing image super-resolution at large scaling factors, particularly in reference-based super-resolution (RefSR) where high-resolution reference images provide critical fine-grained texture priors. However, existing methods often suffer from a trade-off between over-reliance on reference information, which leads to texture artifacts, and underutilization, which results in insufficient detail recovery. To address these issues, we propose DS-DiT, a Decoupled Siamese Diffusion Transformer method that decouples low-resolution and reference interactions at the attention level. By enabling low-resolution structural priors and reference texture information to interact independently with the noisy latent, the framework effectively mitigates inter-source competition. Furthermore, to compensate for the limited local modeling…
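The attention-level decoupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single-head attention without learned projections, the residual sum, and the per-patch `ref_weight` vector are all assumptions made for illustration. The sketch only shows the general idea that the noisy latent attends to the low-resolution tokens and to the reference tokens in two separate passes, so the two sources do not compete inside one softmax.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    # Single-head scaled dot-product attention.
    # Assumption: no learned Q/K/V projections, to keep the sketch short.
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ context

def decoupled_attention(latent, lr_tokens, ref_tokens, ref_weight):
    # Hypothetical decoupled block: the latent attends to each source
    # independently, then the two results are combined.
    lr_out = cross_attention(latent, lr_tokens)    # structural prior branch
    ref_out = cross_attention(latent, ref_tokens)  # reference texture branch
    # Assumption: a per-patch scalar in [0, 1] scales the reference branch.
    return latent + lr_out + ref_weight[:, None] * ref_out

rng = np.random.default_rng(0)
latent = rng.normal(size=(16, 8))      # 16 patches, 8 channels
lr_tokens = rng.normal(size=(16, 8))
ref_tokens = rng.normal(size=(16, 8))
ref_weight = np.full(16, 0.5)          # placeholder weights, not learned
out = decoupled_attention(latent, lr_tokens, ref_tokens, ref_weight)
```

Setting `ref_weight` to zero for a patch removes the reference contribution for that patch entirely, which is one simple way to picture how a patch-level weight could limit over-reliance on the reference where it is unhelpful.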
