Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment

Tengjun Huang

arXiv:2404.18253 · cs.CV · May 29, 2024 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces HarMA, a novel method for remote sensing that enhances transfer learning and modality alignment, achieving state-of-the-art results with minimal training overhead and broad applicability.

Contribution

HarMA is a new approach that simultaneously addresses task constraints, modality alignment, and uniformity, improving multimodal transfer learning efficiency in remote sensing.

Findings

01. HarMA achieves state-of-the-art performance in remote sensing retrieval tasks.

02. HarMA outperforms fully fine-tuned models with fewer parameters.

03. HarMA is compatible with existing multimodal pretraining models.

Abstract

With the rise of Visual and Language Pretraining (VLP), an increasing number of downstream tasks are adopting the paradigm of pretraining followed by fine-tuning. Although this paradigm has demonstrated potential in various multimodal downstream tasks, its implementation in the remote sensing domain encounters some obstacles. Specifically, the tendency for same-modality embeddings to cluster together impedes efficient transfer learning. To tackle this issue, we review the aim of multimodal transfer learning for downstream tasks from a unified perspective, and rethink the optimization process based on three distinct objectives. We propose "Harmonized Transfer Learning and Modality Alignment (HarMA)", a method that simultaneously satisfies task constraints, modality alignment, and single-modality uniform alignment, while minimizing training overhead through parameter-efficient…
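To make the terms "modality alignment" and "uniformity" concrete, here is a minimal sketch of how the two quantities are commonly measured on L2-normalized embeddings in the contrastive-learning literature. It is not HarMA's objective, which the truncated abstract does not specify. The function names, the toy vectors, and the temperature `t=2.0` are all illustrative assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(image_embs, text_embs):
    """Mean squared distance between paired image/text embeddings.

    Lower values mean matched pairs sit closer together.
    """
    pairs = list(zip(image_embs, text_embs))
    return sum(sq_dist(i, t) for i, t in pairs) / len(pairs)


def uniformity(embs, t=2.0):
    """Log of the mean Gaussian potential over distinct pairs in one modality.

    Lower (more negative) values mean the embeddings are more spread out;
    values near zero mean they cluster together.
    """
    n = len(embs)
    total, count = 0.0, 0
    for a in range(n):
        for b in range(a + 1, n):
            total += math.exp(-t * sq_dist(embs[a], embs[b]))
            count += 1
    return math.log(total / count)


# Hypothetical toy embeddings (2-D for readability).
images = [l2_normalize(v) for v in [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]]
texts = [l2_normalize(v) for v in [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]]
clustered = [l2_normalize(v) for v in [[1.0, 0.0], [0.99, 0.05], [0.98, -0.05]]]

print("alignment(images, texts):", alignment(images, texts))
print("uniformity(spread):", uniformity(images))
print("uniformity(clustered):", uniformity(clustered))
```

In this toy setup the `clustered` set scores closer to zero on uniformity than the spread-out `images` set, which is the kind of same-modality clustering the abstract describes as an obstacle to efficient transfer.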

Code & Models

Repositories

seekerhuang/harma
PyTorch · Official

Taxonomy

Topics: Remote-Sensing Image Classification · Remote Sensing and Land Use