CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding
Mingming Zhang, Qingjie Liu, and Yunhong Wang

TL;DR
This paper introduces CtxMIM, a self-supervised masked image modeling approach that enhances remote sensing image understanding by incorporating contextual information, leading to superior performance on multiple downstream tasks.
Contribution
The paper presents a novel context-enhanced masked image modeling method (CtxMIM) specifically designed for remote sensing images, improving feature learning without relying on temporal or geographical constraints.
Findings
Outperforms supervised and self-supervised methods on land cover classification
Achieves superior results in semantic segmentation, object detection, and instance segmentation
Demonstrates high generalization and transferability of learned features
Abstract
Learning representations through self-supervision on unlabeled data has proven highly effective for understanding diverse images. However, remote sensing images often have complex and densely populated scenes with multiple land objects and no clear foreground objects. This intrinsic property generates high object density, resulting in false positive pairs or missing contextual information in self-supervised learning. To address these problems, we propose a context-enhanced masked image modeling method (CtxMIM), a simple yet efficient MIM-based self-supervised learning method for remote sensing image understanding. CtxMIM formulates original image patches as a reconstructive template and employs a Siamese framework to operate on two sets of image patches. A context-enhanced generative branch is introduced to provide contextual information through context consistency constraints in the…
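The abstract mentions two sets of image patches, with the original patches serving as a reconstructive template. As a rough illustration only, the sketch below builds such a pair from a toy image: the full patch set and a randomly masked visible subset. The patch size, mask ratio, uniform random masking, and all function names are assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import random

def patchify(image, patch):
    """Split a 2-D grid (list of rows) into non-overlapping patch x patch tiles."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    tiles = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tiles.append([row[c:c + patch] for row in image[r:r + patch]])
    return tiles

def random_mask(num_patches, mask_ratio, seed=0):
    """Pick patch indices to hide. Uniform random masking is an assumption here."""
    rng = random.Random(seed)
    k = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), k))

def two_patch_sets(image, patch=2, mask_ratio=0.5, seed=0):
    """Return the full patch set (template), the visible subset, and the masked indices."""
    template = patchify(image, patch)
    masked = set(random_mask(len(template), mask_ratio, seed))
    # Keep the index alongside each visible patch so positions are not lost.
    visible = [(i, t) for i, t in enumerate(template) if i not in masked]
    return template, visible, sorted(masked)

# Toy 4x4 "image" whose pixel values are just their flat index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
template, visible, masked = two_patch_sets(image, patch=2, mask_ratio=0.5)
print(len(template), len(visible), masked)
```

In an actual MIM pipeline the visible subset would be fed to an encoder and the masked positions reconstructed; here the two lists only show how the inputs to the two branches could differ.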
Taxonomy
Topics: Remote-Sensing Image Classification · Advanced Image and Video Retrieval Techniques · Remote Sensing and Land Use
