Are Pretrained Image Matchers Good Enough for SAR-Optical Satellite Registration?
Isaac Corley, Alex Stoken, Gabriele Berton

TL;DR
This paper evaluates pretrained image matchers on cross-modal SAR-optical satellite registration without fine-tuning, examining how well they transfer to this domain and how sensitive their accuracy is to the deployment protocol.
Contribution
The paper provides a comprehensive zero-shot benchmark of 24 pretrained matchers on satellite data, highlighting the role of foundation-model features and deployment protocols.
Findings
Explicit cross-modal training does not always outperform non-cross-modal matchers.
RoMa, alongside XoFTR, achieves the lowest reported mean error on the labeled SpaceNet9 training scenes, and does so without any cross-modal training.
Deployment choices significantly impact registration accuracy, sometimes more than the matcher selection.
Abstract
Cross-modal optical-SAR (Synthetic Aperture Radar) registration is a bottleneck for disaster response via remote sensing, yet modern image matchers are developed and benchmarked almost exclusively on natural-image domains. We evaluate twenty-four pretrained matcher families--in a zero-shot setting with no fine-tuning or domain adaptation on satellite or SAR data--on SpaceNet9 and two additional cross-modal benchmarks under a deterministic protocol with tiled large-image inference, robust geometric filtering, and tie-point-grounded metrics. Our results reveal asymmetric transfer--matchers with explicit cross-modal training do not uniformly outperform those without it. While XoFTR (trained for visible-thermal matching) and RoMa achieve the lowest reported mean error at px on the labeled SpaceNet9 training scenes, RoMa achieves this without any cross-modal training, and…
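To make two of the protocol ingredients named in the abstract more concrete (tiled large-image inference and a tie-point-grounded error), here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`tile_origins`, `apply_affine`, `mean_tie_point_error`), the tile and overlap sizes, and the choice of a 6-parameter affine transform are all hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
# Hypothetical affine layout: x' = a*x + b*y + tx, y' = c*x + d*y + ty
Affine = Tuple[float, float, float, float, float, float]


def tile_origins(size: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of overlapping tiles covering [0, size) along one axis."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if size <= tile:
        return [0]  # the image fits in a single tile
    stride = tile - overlap
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)  # last tile sits flush with the image edge
    return starts


def apply_affine(m: Affine, p: Point) -> Point:
    """Map one pixel coordinate through the affine transform m."""
    a, b, tx, c, d, ty = m
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)


def mean_tie_point_error(
    m: Affine, src: Sequence[Point], dst: Sequence[Point]
) -> float:
    """Mean Euclidean distance (px) between warped src and reference dst points."""
    if not src or len(src) != len(dst):
        raise ValueError("src and dst must be non-empty and of equal length")
    errors = [math.dist(apply_affine(m, s), d) for s, d in zip(src, dst)]
    return sum(errors) / len(errors)
```

A matcher would be run on each tile (rows and columns both taken from `tile_origins`), with the resulting matches shifted back into full-image coordinates before fitting a transform. The fitted transform is then scored against manually labeled tie points with `mean_tie_point_error`, so the error is tied to ground truth rather than to the matcher's own inliers.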
