Robust Self-Supervised Cross-Modal Super-Resolution against Real-World Misaligned Observations
Xiaoyu Dong, Jiahuan Li, Ziteng Cui, Naoto Yokoya

TL;DR
RobSelf is a self-supervised cross-modal super-resolution method that effectively handles real-world misaligned data by jointly optimizing a misalignment-aware feature translator and a content-aware reference filter, achieving state-of-the-art results.
Contribution
The paper introduces a self-supervised framework for real-world cross-modal SR that jointly optimizes its alignment and enhancement modules online.
Findings
Outperforms existing self-supervised and supervised methods.
Achieves up to 15.3× faster processing than prior self-supervised approaches.
Demonstrates superior performance on both synthesized and real-world data.
Abstract
Cross-modal super-resolution (SR) on real-world misaligned data is challenging, as only unlabeled low-resolution (LR) source and high-resolution (HR) guide images with complex spatial misalignment are available. Previous methods either rely on fully simulated training data or adopt suboptimal alignment strategies that overlook cross-modal dependencies, limiting their performance in practice. To address these issues, we propose RobSelf, a self-supervised model that jointly optimizes a misalignment-aware feature translator and a content-aware reference filter online. The translator resolves unsupervised cross-modal and cross-resolution alignment via weakly-supervised, misalignment-aware translation, yielding an aligned guide feature. Guided by this feature, the filter performs reference-based discriminative self-enhancement on the source, enabling SR prediction with high resolution and…
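The abstract notes that only an unlabeled LR source and an HR guide are available, so any training signal has to come from the observations themselves. A minimal sketch of one common self-supervised signal follows: the SR prediction, once downsampled, should reproduce the LR source. This is a generic illustration and is not confirmed by this summary to be RobSelf's objective. The names `box_downsample` and `consistency_loss`, the average-pooling degradation, and the L1 distance are all assumptions made for the example.

```python
import numpy as np

def box_downsample(img, scale):
    """Average-pool an (H, W) array by an integer factor (assumed degradation model)."""
    h, w = img.shape
    if h % scale or w % scale:
        raise ValueError("image size must be divisible by scale")
    # Split each axis into (blocks, scale) and average inside every block.
    return img.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def consistency_loss(sr_pred, lr_source, scale):
    """Mean absolute error between the downsampled prediction and the LR source."""
    return float(np.abs(box_downsample(sr_pred, scale) - lr_source).mean())

# Toy check: nearest-neighbour upsampling of the LR source is perfectly consistent,
# while a prediction with a constant offset is penalized.
lr = np.arange(16, dtype=np.float64).reshape(4, 4)
sr_nn = np.kron(lr, np.ones((2, 2)))          # 8x8 nearest-neighbour upsample
loss_nn = consistency_loss(sr_nn, lr, 2)
loss_shifted = consistency_loss(sr_nn + 1.0, lr, 2)
```

A loss of this kind constrains only the low-frequency content of the prediction. In guided SR the high-frequency detail has to come from the HR guide, which is why misalignment between guide and source matters so much in this setting.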
Taxonomy
Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
