Robust Self-Supervised Cross-Modal Super-Resolution against Real-World Misaligned Observations

Xiaoyu Dong; Jiahuan Li; Ziteng Cui; Naoto Yokoya

arXiv:2602.18822 · cs.CV · March 9, 2026

TL;DR

RobSelf is a self-supervised cross-modal super-resolution method for real-world misaligned data. It jointly optimizes a misalignment-aware feature translator and a content-aware reference filter online, and reports state-of-the-art results.

Contribution

The paper introduces a self-supervised framework for real-world cross-modal SR that jointly optimizes its alignment and enhancement modules online.

Findings

01 Outperforms existing self-supervised and supervised methods.

02 Achieves up to 15.3× faster processing than prior self-supervised approaches.

03 Demonstrates superior performance on both synthesized and real-world data.

Abstract

Cross-modal super-resolution (SR) on real-world misaligned data is challenging, as only unlabeled low-resolution (LR) source and high-resolution (HR) guide images with complex spatial misalignment are available. Previous methods either rely on fully simulated training data or adopt suboptimal alignment strategies that overlook cross-modal dependencies, limiting their performance in practice. To address these issues, we propose RobSelf, a self-supervised model that jointly optimizes a misalignment-aware feature translator and a content-aware reference filter online. The translator resolves unsupervised cross-modal and cross-resolution alignment via weakly-supervised, misalignment-aware translation, yielding an aligned guide feature. Guided by this feature, the filter performs reference-based discriminative self-enhancement on the source, enabling SR prediction with high resolution and…
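
To make the self-supervised setting concrete: when only an LR source and an HR guide are available, one common way to obtain a training signal is to degrade the SR prediction back to the source resolution and compare it with the observed LR source. The sketch below illustrates that generic idea only. The abstract does not state RobSelf's actual loss, so the box-average degradation, the L1 distance, and the helper names `box_downsample`, `nearest_upsample`, and `lr_consistency_loss` are all assumptions introduced for illustration. Plain Python lists stand in for image arrays to keep the example dependency-free.

```python
def box_downsample(img, scale):
    """Average non-overlapping scale x scale blocks of a 2D list (assumed degradation)."""
    h, w = len(img), len(img[0])
    assert h % scale == 0 and w % scale == 0, "image size must be divisible by scale"
    out = []
    for i in range(0, h, scale):
        row = []
        for j in range(0, w, scale):
            block = [img[i + di][j + dj] for di in range(scale) for dj in range(scale)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def nearest_upsample(img, scale):
    """Repeat every pixel scale times along both axes (stand-in for an SR prediction)."""
    return [[v for v in row for _ in range(scale)] for row in img for _ in range(scale)]


def lr_consistency_loss(sr_pred, lr_source, scale):
    """Mean absolute error between the degraded prediction and the observed LR source.

    Hypothetical loss for illustration; not taken from the paper.
    """
    degraded = box_downsample(sr_pred, scale)
    diffs = [abs(a - b) for ra, rb in zip(degraded, lr_source) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    lr = [[0.0, 1.0], [2.0, 3.0]]
    sr = nearest_upsample(lr, 2)           # a prediction consistent with the LR source
    print(lr_consistency_loss(sr, lr, 2))  # consistency loss for that prediction
```

A loss of this kind constrains the prediction only at the source resolution. The high-frequency detail has to come from the guide, which is why alignment between guide and source matters in the setting the abstract describes.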

Taxonomy

Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning