Multi-level Cross-modal Feature Alignment via Contrastive Learning towards Zero-shot Classification of Remote Sensing Image Scenes
Chun Liu, Suqiang Ma, Zheng Li, Wei Yang, Zhigang Han

TL;DR
This paper introduces a multi-level contrastive learning approach for zero-shot remote sensing image scene classification, effectively aligning cross-modal features and improving classification accuracy without extensive labeled data.
Contribution
It proposes a novel multi-level contrastive learning method that considers cross-instance relationships, enhancing zero-shot classification performance in remote sensing images.
Findings
Outperforms state-of-the-art zero-shot classification methods
Effectively handles intra-class variation and inter-class similarity
Demonstrates robustness to noisy samples
Abstract
Zero-shot classification of image scenes, which recognizes image scenes not seen during training, holds great promise for lowering the dependence on large numbers of labeled samples. To address zero-shot image scene classification, cross-modal feature alignment methods have been proposed in recent years. These methods mainly focus on matching the visual features of each image scene with its corresponding semantic descriptor in the latent space. Less attention has been paid to the contrastive relationships between different image scenes and different semantic descriptors. Given the large intra-class differences and inter-class similarities among image scenes, and the potentially noisy samples, these methods are susceptible to the influence of instances that lie far from those of the same class and close to those of other classes. In this…
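To make the idea of cross-modal alignment with contrastive relationships concrete, here is a minimal sketch of a generic symmetric contrastive (InfoNCE-style) loss between visual features and semantic descriptors, followed by nearest-descriptor zero-shot prediction. It is not the authors' multi-level method, which the truncated abstract does not describe. The function names, the temperature value, and the toy vectors are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_logits(visual, semantic, temperature=0.1):
    """Pairwise cosine similarity between every visual and semantic vector,
    divided by a temperature (hypothetical default, not from the paper)."""
    vis = [l2_normalize(v) for v in visual]
    sem = [l2_normalize(s) for s in semantic]
    return [[sum(a * b for a, b in zip(v, s)) / temperature for s in sem]
            for v in vis]

def cross_entropy_diag(logits):
    """Mean cross-entropy where row i's correct column is i (the matched pair)."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(visual, semantic, temperature=0.1):
    """Average of the visual-to-semantic and semantic-to-visual losses.
    Matched pairs are pulled together; all other pairs in the batch act
    as negatives and are pushed apart."""
    logits = cosine_logits(visual, semantic, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))

def zero_shot_predict(visual_feature, class_descriptors):
    """Return the index of the class descriptor most similar to the image feature."""
    sims = cosine_logits([visual_feature], class_descriptors, temperature=1.0)[0]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy data (made up): two image features and their two class descriptors.
visual = [[1.0, 0.0], [0.0, 1.0]]
semantic = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = symmetric_contrastive_loss(visual, semantic)
shuffled_loss = symmetric_contrastive_loss(visual, semantic[::-1])
predicted_class = zero_shot_predict([1.0, 0.2], semantic)
```

In this toy setup the correctly paired batch yields a lower loss than the same batch with its descriptors swapped, which is the signal a contrastive objective uses to shape the shared latent space. At test time, an unseen class only needs a semantic descriptor to be a candidate for prediction.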
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · interferon and immune responses
Methods: Contrastive Learning · Focus
