SCPNet: Unsupervised Cross-modal Homography Estimation via Intra-modal Self-supervised Learning
Runmin Zhang, Jun Ma, Si-Yuan Cao, Lun Luo, Beinan Yu, Shu-Jie Chen, Junwei Li, Hui-Liang Shen

TL;DR
SCPNet introduces an unsupervised framework for cross-modal homography estimation using intra-modal self-supervised learning, correlation, and feature projection, achieving state-of-the-art results on satellite-map datasets without supervision.
Contribution
The paper presents the first method to achieve effective unsupervised cross-modal homography estimation on the satellite-map GoogleMap dataset, combining intra-modal self-supervised learning with a correlation-based estimation network and consistent feature map projection.
Findings
Leads the supervised approach MHN by 14.0% in mean average corner error (MACE) on the GoogleMap satellite-map dataset, under [-32,+32] offset on a 128x128 image.
Achieves state-of-the-art unsupervised performance on multiple datasets.
Reduces mean average corner error significantly compared to prior approaches.
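The findings above are stated in terms of MACE. As a minimal sketch of how that metric is commonly computed, the snippet below averages the Euclidean distance between predicted and ground-truth corner positions over the four corners of each sample, then over all samples. The function name `mace` and the input layout (a list of samples, each a list of four `(x, y)` tuples) are assumptions made for illustration and are not taken from the paper's code.

```python
import math


def mace(pred_corners, gt_corners):
    """Mean average corner error over a batch of samples.

    Each sample is a list of four (x, y) corner positions in pixels.
    This input layout is an illustrative assumption.
    """
    per_sample = []
    for pred, gt in zip(pred_corners, gt_corners):
        # Euclidean distance for each of the four corners of this sample.
        dists = [
            math.hypot(px - gx, py - gy)
            for (px, py), (gx, gy) in zip(pred, gt)
        ]
        # Average corner error (ACE) for this sample.
        per_sample.append(sum(dists) / len(dists))
    # Mean of the per-sample ACE values.
    return sum(per_sample) / len(per_sample)


# Ground-truth corners of a 128x128 image, and a prediction that is
# off by (3, 4) pixels at every corner.
gt = [[(0, 0), (127, 0), (127, 127), (0, 127)]]
pred = [[(x + 3, y + 4) for (x, y) in gt[0]]]
error = mace(pred, gt)
```

Here every corner is displaced by a 3-4-5 triangle, so each corner error is 5 pixels and so is the resulting MACE.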
Abstract
We propose a novel unsupervised cross-modal homography estimation framework based on intra-modal Self-supervised learning, Correlation, and consistent feature map Projection, namely SCPNet. The concept of intra-modal self-supervised learning is first presented to facilitate the unsupervised cross-modal homography estimation. The correlation-based homography estimation network and the consistent feature map projection are combined to form the learnable architecture of SCPNet, boosting the unsupervised learning framework. SCPNet is the first to achieve effective unsupervised homography estimation on the satellite-map image pair cross-modal dataset, GoogleMap, under [-32,+32] offset on a 128x128 image, leading the supervised approach MHN by 14.0% of mean average corner error (MACE). We further conduct extensive experiments on several cross-modal/spectral and manually-made inconsistent…
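The abstract evaluates under a [-32,+32] corner offset on a 128x128 image. One common way to build a homography training label within a single modality is to perturb the four image corners by random offsets in that range and solve for the homography that maps the original corners to the perturbed ones. The sketch below illustrates that generic four-point construction only; whether SCPNet samples its intra-modal pairs this way is not stated in the abstract, and all function names (`solve_linear`, `homography_from_corners`, `apply_h`, `sample_pair_label`) are hypothetical.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """3x3 homography mapping four src corners to four dst corners (h33 = 1)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence (direct linear transform).
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(H, pt):
    """Apply homography H to a 2D point, including the projective divide."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)


def sample_pair_label(size=128, max_offset=32, rng=random):
    """Sample random corner offsets and the homography they induce.

    size and max_offset mirror the 128x128 / [-32,+32] setting in the
    abstract; the uniform sampling scheme itself is an assumption.
    """
    corners = [(0, 0), (size - 1, 0), (size - 1, size - 1), (0, size - 1)]
    offsets = [
        (rng.uniform(-max_offset, max_offset),
         rng.uniform(-max_offset, max_offset))
        for _ in corners
    ]
    warped = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(corners, offsets)]
    return corners, offsets, homography_from_corners(corners, warped)


corners, offsets, H = sample_pair_label(rng=random.Random(0))
```

In a setup like this, the sampled offsets serve as the regression target and `H` is used to warp the image into the second view of the pair. Because both views come from the same image, no cross-modal ground truth is needed to produce the label.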
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques
