# LGD-DeepLabV3+: An Enhanced Framework for Remote Sensing Semantic Segmentation via Multi-Level Feature Fusion and Global Modeling

**Authors:** Xin Wang, Xu Liu, Adnan Mahmood, Yaxin Yang, Xipeng Li

PMC · DOI: 10.3390/s26031008 · Sensors (Basel, Switzerland) · 2026-02-03

## TL;DR

This paper introduces LGD-DeepLabV3+, an improved model for remote sensing image segmentation that enhances accuracy and boundary clarity through multi-level feature fusion and global modeling.

## Contribution

The paper adds three components to DeepLabV3+: a feature-mapping network at the input stage, a routing-style global modeling module placed after ASPP, and a decoder fusion module that aligns and couples shallow details with deep semantics.

## Key findings

- Compared to the baseline, the model improves mIoU by 8.83% on the LoveDA dataset and by 6.72% on the ISPRS Potsdam dataset.
- Qualitative results show clearer boundaries and more stable region annotations.
- The proposed modules are plug-and-play and suitable for integration into remote sensing pipelines.
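
For readers unfamiliar with the metric behind the headline numbers, the following is a minimal sketch of how mean Intersection over Union (mIoU) is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou` are hypothetical, and the sketch assumes integer class labels with no ignore label, which may differ from the protocol used for LoveDA or ISPRS Potsdam.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels for each (true class, predicted class) pair."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (true, pred) pair as one index, then histogram the indices.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(y_true, y_pred, num_classes):
    """Average per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    cm = confusion_matrix(y_true, y_pred, num_classes)
    tp = np.diag(cm).astype(float)
    # Union = predicted pixels + ground-truth pixels - overlap.
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0  # assumption: classes absent from both maps are skipped
    return float((tp[present] / union[present]).mean())

# Toy 2x2 label maps with two classes (illustrative values, not paper data).
y_true = [[0, 0], [1, 1]]
y_pred = [[0, 1], [1, 1]]
print(mean_iou(y_true, y_pred, num_classes=2))  # IoU is 1/2 and 2/3, mean ≈ 0.583
```

Averaging over classes rather than over pixels means that small classes weigh as much as large ones, which is why gains on thin structures and boundaries can move mIoU noticeably.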

## Abstract

Remote sensing semantic segmentation encounters several challenges, including scale variation, the coexistence of inter-class similarity and intra-class diversity, difficulties in modeling long-range dependencies, and shadow occlusions. Slender structures and complex boundaries present particular segmentation difficulties, especially in high-resolution imagery acquired by satellite and aerial cameras, UAV-borne optical sensors, and other imaging payloads. These sensing systems deliver large-area coverage with fine ground sampling distance, which magnifies domain shifts between different sensors and acquisition conditions. This work builds upon DeepLabV3+ and proposes complementary improvements at three stages: input, context, and decoder fusion. First, to mitigate the interference of complex and heterogeneous data distributions on network optimization, a feature-mapping network is introduced to project raw images into a simpler distribution before they are fed into the segmentation backbone. This approach facilitates training and enhances feature separability. Second, although Atrous Spatial Pyramid Pooling (ASPP) aggregates multi-scale context, it remains insufficient for modeling long-range dependencies. Therefore, a routing-style global modeling module is incorporated after ASPP to strengthen global relation modeling and ensure cross-region semantic consistency. Third, considering that the fusion between shallow details and deep semantics in the decoder is limited and prone to boundary blurring, a fusion module is designed to facilitate deep interaction and joint learning through cross-layer feature alignment and coupling. The proposed model improves the mean Intersection over Union (mIoU) by 8.83% on the LoveDA dataset and by 6.72% on the ISPRS Potsdam dataset compared to the baseline. Qualitative results further demonstrate clearer boundaries and more stable region annotations, while the proposed modules are plug-and-play and easy to integrate into camera-based remote sensing pipelines and other imaging-sensor systems, providing a practical accuracy–efficiency trade-off.
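
To make the idea of global relation modeling after ASPP more concrete, here is a minimal NumPy sketch. This summary does not specify the paper's routing-style module, so the sketch deliberately substitutes plain single-head scaled dot-product self-attention, a different and simpler technique, only to illustrate how every spatial position can aggregate information from every other position. The function `global_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_self_attention(feat, w_q, w_k, w_v):
    """Plain self-attention over a (C, H, W) feature map.

    NOTE: a stand-in for illustration, not the paper's routing-style module.
    w_q, w_k, w_v are hypothetical (C, D) projection matrices.
    """
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T             # (N, C): one token per position
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    # (N, N) weights: each position attends to all positions, near or far.
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    out = attn @ v                                # (N, D) globally mixed features
    return out.T.reshape(-1, h, w), attn          # back to (D, H, W)

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))             # stand-in for an ASPP output
w_q, w_k, w_v = (rng.standard_normal((8, 8)) * 0.1 for _ in range(3))
out, attn = global_self_attention(feat, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Unlike a dilated convolution, whose receptive field is fixed by its kernel size and dilation rate, the attention matrix here has a nonzero weight between every pair of positions, which is the property the abstract refers to as modeling long-range dependencies.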

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12900102/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12900102/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12900102/full.md

---
Source: https://tomesphere.com/paper/PMC12900102