Contrastive Multiview Coding with Electro-optics for SAR Semantic Segmentation
Keumgang Cha, Junghoon Seo, Yeji Choi

TL;DR
This paper introduces a multi-modal contrastive learning approach combining SAR, electro-optical images, and label masks to improve semantic segmentation in remote sensing, enhancing model performance and training efficiency.
Contribution
It proposes a novel multi-modal representation learning method that jointly leverages SAR, EO imagery, and label masks for better semantic segmentation in remote sensing.
Findings
Outperforms existing methods in model performance
Improves sample efficiency during training
Speeds up convergence of the learning process
Abstract
In the training of deep learning models, how the model parameters are initialized greatly affects model performance, sample efficiency, and convergence speed. Representation learning for model initialization has recently been actively studied in the remote sensing field. In particular, the appearance characteristics of imagery obtained using a synthetic aperture radar (SAR) sensor are quite different from those of general electro-optical (EO) images, and thus representation learning is even more important in the remote sensing domain. Motivated by contrastive multiview coding, we propose multi-modal representation learning for SAR semantic segmentation. Unlike previous studies, our method jointly uses EO imagery, SAR imagery, and a label mask. Several experiments show that our approach is superior to the existing methods in model performance, sample efficiency, and convergence…
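As a rough illustration of the idea of contrasting several views of the same scene, the sketch below computes a generic InfoNCE-style loss over every pair of views (SAR, EO, and label mask). It is not the authors' formulation: the function names, the temperature value, the toy two-dimensional embeddings, and the assumption that the label mask is embedded into the same space as the imagery are all hypothetical choices made for this example.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, candidates, temperature=0.1):
    """Mean InfoNCE loss where anchors[i] should match candidates[i].

    All other candidates in the batch act as negatives.
    """
    a = [l2_normalize(v) for v in anchors]
    c = [l2_normalize(v) for v in candidates]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, cj)) / temperature for cj in c]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def multiview_contrastive_loss(views, temperature=0.1):
    """Sum the symmetric InfoNCE loss over every pair of views.

    `views` maps a view name to a batch of embeddings; index i in each
    batch is assumed to come from the same scene.
    """
    loss = 0.0
    for name_a, name_b in combinations(sorted(views), 2):
        loss += info_nce(views[name_a], views[name_b], temperature)
        loss += info_nce(views[name_b], views[name_a], temperature)
    return loss


# Toy embeddings (hypothetical): two scenes, three views each.
aligned = {
    "sar": [[1.0, 0.0], [0.0, 1.0]],
    "eo": [[0.9, 0.1], [0.1, 0.9]],
    "mask": [[1.0, 0.2], [0.2, 1.0]],
}
# Same embeddings, but the EO batch is paired with the wrong scenes.
shuffled = dict(aligned, eo=aligned["eo"][::-1])

if __name__ == "__main__":
    print("aligned loss: ", multiview_contrastive_loss(aligned))
    print("shuffled loss:", multiview_contrastive_loss(shuffled))
```

In this toy setup the loss is lower when embeddings of the same scene agree across views than when one view is mispaired, which is the signal a contrastive pre-training objective of this kind would use before fine-tuning for segmentation.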
