From Pixels to Images: A Structural Survey of Deep Learning Paradigms in Remote Sensing Image Semantic Segmentation
Quanwei Liu, Tao Huang, Jiaqi Yang, Wei Xiang

TL;DR
This survey comprehensively reviews deep learning methods for remote sensing image semantic segmentation, organizing them into a hierarchical framework from pixel to image level, and discusses datasets, challenges, and reproducibility resources.
Contribution
The survey provides a unified, structured overview of deep learning (DL)-based remote sensing image semantic segmentation (RSISS) across segmentation granularities (pixel, patch, tile, and image level), addressing gaps in existing reviews.
Findings
Deep learning has significantly advanced RSISS accuracy.
Emerging image-level models are pushing the boundaries of segmentation.
Open challenges include data scale, model efficiency, and multimodal integration.
Abstract
Semantic segmentation (SS) of remote sensing images (RSIs) enables the fine-grained interpretation of surface features, making it a critical task in remote sensing (RS) analysis. With the increasing diversity and volume of RSIs collected by sensors on various platforms, traditional processing methods struggle to maintain efficiency and accuracy. In response, deep learning (DL) has emerged as a transformative approach, enabling substantial advances in remote sensing image semantic segmentation (RSISS) by automating hierarchical feature extraction and improving segmentation performance across diverse modalities. As data scale and model capacity have increased, DL-based RSISS has undergone a structural evolution from pixel-level and patch-based classification to tile-level, end-to-end segmentation, and, more recently, to image-level modelling with vision foundation models. However, existing reviews often focus on individual…
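The structural shift the abstract describes, from patch-based classification to tile-level segmentation, can be illustrated with a minimal NumPy sketch. This is not code from the survey: the function names `extract_patches` and `split_into_tiles` are hypothetical, no model is included, and the sketch only shows how the input is prepared under each paradigm, assuming an image stored as a height x width x channels array.

```python
import numpy as np


def extract_patches(image, patch_size):
    """Patch-based paradigm: one patch centred on every pixel.

    A classifier would then predict a single label for each centre pixel.
    Returns an array of shape (H, W, C, patch_size, patch_size).
    """
    if patch_size % 2 == 0:
        raise ValueError("patch_size must be odd so each patch has a centre pixel")
    pad = patch_size // 2
    # Reflect-pad the two spatial axes so border pixels also get full patches.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    return np.lib.stride_tricks.sliding_window_view(
        padded, (patch_size, patch_size), axis=(0, 1)
    )


def split_into_tiles(image, tile_size):
    """Tile-level paradigm: non-overlapping tiles.

    A segmentation network would predict a dense label map for each tile
    in one forward pass. Returns shape (num_tiles, tile_size, tile_size, C).
    """
    h, w, c = image.shape
    if h % tile_size or w % tile_size:
        raise ValueError("image height and width must be multiples of tile_size")
    tiles = image.reshape(h // tile_size, tile_size, w // tile_size, tile_size, c)
    # Bring the two tile-grid axes together, then flatten them into one axis.
    return tiles.swapaxes(1, 2).reshape(-1, tile_size, tile_size, c)


# Toy 8x8 image with 3 bands (illustrative values only).
image = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
patches = extract_patches(image, patch_size=3)
tiles = split_into_tiles(image, tile_size=4)
```

On this toy input the patch-based route yields one patch per pixel (64 classifier calls for an 8x8 image), while the tile-level route yields 4 tiles, each of which would be labelled densely in a single pass. That difference in the number of model invocations is one reason the tile-level formulation scales better to large scenes.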
Taxonomy
Topics: Remote-Sensing Image Classification
