SCPNet: Unsupervised Cross-modal Homography Estimation via Intra-modal Self-supervised Learning

Runmin Zhang; Jun Ma; Si-Yuan Cao; Lun Luo; Beinan Yu; Shu-Jie Chen; Junwei Li; Hui-Liang Shen

arXiv:2407.08148·cs.CV·July 12, 2024

TL;DR

SCPNet is an unsupervised framework for cross-modal homography estimation that combines intra-modal self-supervised learning, a correlation-based estimation network, and consistent feature map projection. On the GoogleMap satellite-map dataset it outperforms the supervised method MHN without using ground-truth homography labels.

Contribution

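The paper introduces intra-modal self-supervised learning as a way to train cross-modal homography estimation without supervision, and pairs it with a correlation-based network and consistent feature map projection. The authors report that it is the first method to achieve effective unsupervised homography estimation on the GoogleMap satellite-map dataset.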

Findings

01

Outperforms the supervised method MHN on the GoogleMap satellite-map dataset by 14.0% in mean average corner error (MACE), under [-32,+32] offsets on 128x128 images.

02

Achieves state-of-the-art unsupervised performance on multiple datasets.

03

Reduces mean average corner error significantly compared to prior approaches.
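
The findings above are all stated in terms of MACE. As a reading aid, here is a minimal sketch of how that metric is commonly computed: the Euclidean distance between predicted and ground-truth positions of the four image corners, averaged over corners and image pairs. The function name and the array layout are assumptions made for illustration, not taken from the paper or its repository.

```python
import numpy as np

def mean_average_corner_error(pred_corners, gt_corners):
    """MACE over a batch; both inputs have shape (N, 4, 2) in pixels.

    Hypothetical helper for illustration, not the authors' evaluation code.
    """
    pred = np.asarray(pred_corners, dtype=float)
    gt = np.asarray(gt_corners, dtype=float)
    # Euclidean distance for each of the four corners, shape (N, 4).
    per_corner = np.linalg.norm(pred - gt, axis=-1)
    # Average over corners and over image pairs.
    return float(per_corner.mean())

# One 128x128 image whose predicted corners are all off by (3, 4) pixels.
gt = np.array([[[0, 0], [127, 0], [127, 127], [0, 127]]], dtype=float)
pred = gt + np.array([3.0, 4.0])
mace = mean_average_corner_error(pred, gt)
```

In this toy case every corner is displaced by a 3-4-5 triangle, so the value is 5 pixels. Lower is better.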

Abstract

We propose a novel unsupervised cross-modal homography estimation framework based on intra-modal Self-supervised learning, Correlation, and consistent feature map Projection, namely SCPNet. The concept of intra-modal self-supervised learning is first presented to facilitate the unsupervised cross-modal homography estimation. The correlation-based homography estimation network and the consistent feature map projection are combined to form the learnable architecture of SCPNet, boosting the unsupervised learning framework. SCPNet is the first to achieve effective unsupervised homography estimation on the satellite-map image pair cross-modal dataset, GoogleMap, under [-32,+32] offset on a 128x128 image, leading the supervised approach MHN by 14.0% of mean average corner error (MACE). We further conduct extensive experiments on several cross-modal/spectral and manually-made inconsistent…
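
The "[-32,+32] offset on a 128x128 image" setting in the abstract refers to the usual four-point parameterization, in which each image corner is displaced by a bounded amount and those four displacements determine the homography. The sketch below illustrates that relationship with a direct linear solve. The uniform sampling, the corner coordinates (0 to 127), and all function names are assumptions for illustration and are not drawn from the paper or the official repository.

```python
import numpy as np

def sample_corner_offsets(rng, max_offset=32):
    """Draw one (dx, dy) displacement per corner, uniform in [-max_offset, max_offset]."""
    return rng.uniform(-max_offset, max_offset, size=(4, 2))

def homography_from_corners(src, dst):
    """Solve for the 3x3 homography mapping four src points to four dst points.

    Uses the direct linear transform with the bottom-right entry fixed to 1.
    """
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_points(H, pts):
    """Apply homography H to an (N, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]

size = 128
corners = np.array(
    [[0, 0], [size - 1, 0], [size - 1, size - 1], [0, size - 1]], dtype=float
)
rng = np.random.default_rng(0)
offsets = sample_corner_offsets(rng)
H = homography_from_corners(corners, corners + offsets)
reprojected = warp_points(H, corners)  # should land on corners + offsets
```

A network that predicts the eight offset values can therefore be scored directly with MACE, since the offsets are exactly the corner displacements.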

Code & Models

Repositories

rm-zhang/scpnet (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques