# Building extraction from remote sensing images based on multi-scale attention gate and enhanced positional information

**Authors:** Rui Xu, Renzhong Mao, Zhenxing Zhuang, Fenghua Huang, Yihui Yang

PMC · DOI: 10.7717/peerj-cs.2826 · 2025-04-21

## TL;DR

This paper introduces a U-Net-based deep learning method for extracting buildings from high-resolution remote sensing images, aimed at more accurate contours, clearer boundaries, and more complete building structures.

## Contribution

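The paper proposes a building extraction framework built on U-Net that adds a multi-scale attention gate module in the encoder and a positional-information enhancement module in the decoder, targeting blurred edges, incomplete structures, and lost detail.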

## Key findings

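- The proposed method outperforms six state-of-the-art models (SegNet, DeepLabv3+, U-Net, DSATNet, SDSC-Unet, and BuildFormer) on three benchmark datasets (Massachusetts, WHU, and Inria).
- The multi-scale attention gate in the encoder improves multi-scale feature capture, while the positional-information module in the decoder sharpens building shapes and edges.
- Reported intersection over union (IoU) gains over the baselines range from 2.00% to 3.48% on Massachusetts, 1.18% to 4.18% on WHU, and 0.87% to 5.55% on Inria.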
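
Since every reported result is an IoU score, a small sketch of that metric may help. It uses the standard definition for binary masks (overlapping building pixels divided by pixels marked as building in either mask). The function name `iou`, the nested-list mask format, and the empty-mask convention are assumptions for this example, not the authors' evaluation code.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is building in both masks
            if p or t:
                union += 1  # pixel is building in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are building in either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(iou(pred, target))  # 0.5
```

In practice the masks would be thresholded network outputs and ground-truth labels at full image resolution. The summary does not say whether the paper averages IoU per image or pools pixels over a whole dataset.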

## Abstract

Extracting buildings from high-resolution remote sensing images is currently a research hotspot in the field of remote sensing applications. Deep learning methods have significantly improved the accuracy of building extraction, but the extraction results still suffer from deficiencies such as blurred edges, incomplete structures, and loss of detail. To obtain accurate contours and clear boundaries of buildings, this article proposes a novel building extraction method that uses a multi-scale attention gate and enhanced positional information. With U-Net as the main framework, the method introduces a multi-scale attention gate module in the encoder, which improves the ability to capture multi-scale information, and a module in the decoder that enhances the positional information of the features, allowing more precise localization and extraction of the shape and edge information of buildings. To validate the proposed method, comprehensive evaluations were conducted on three benchmark datasets: Massachusetts, WHU, and Inria. Comparative analysis with six state-of-the-art models (SegNet, DeepLabv3+, U-Net, DSATNet, SDSC-Unet, and BuildFormer) demonstrates consistent improvements in intersection over union (IoU). Specifically, the proposed method achieves IoU increments of 2.19%, 3.31%, 3.10%, 2.00%, 3.35%, and 3.48%, respectively, on the Massachusetts dataset; 1.26%, 4.18%, 1.18%, 2.01%, 2.03%, and 2.29% on the WHU dataset; and 0.87%, 5.25%, 2.02%, 5.55%, 4.39%, and 1.18% on the Inria dataset. The experimental results indicate that the proposed method effectively integrates multi-scale features and refines the extracted building edges, achieving superior performance compared with existing methods in building extraction tasks.
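
The IoU increments in the abstract are easier to compare once each is paired with its baseline. The sketch below does that, assuming the increments appear in the same order as the model list for all three datasets (the abstract says "respectively" only once). The names `BASELINES`, `IOU_GAINS`, `gains_by_baseline`, and `summarize` are illustrative, and the numbers are copied from the abstract without further interpretation.

```python
# Baseline order as listed in the abstract.
BASELINES = ["SegNet", "DeepLabv3+", "U-Net", "DSATNet", "SDSC-Unet", "BuildFormer"]

# IoU increments (%) of the proposed method over each baseline, per dataset.
IOU_GAINS = {
    "Massachusetts": [2.19, 3.31, 3.10, 2.00, 3.35, 3.48],
    "WHU": [1.26, 4.18, 1.18, 2.01, 2.03, 2.29],
    "Inria": [0.87, 5.25, 2.02, 5.55, 4.39, 1.18],
}


def gains_by_baseline(dataset):
    """Map each baseline to its reported IoU increment (assumes matching order)."""
    return dict(zip(BASELINES, IOU_GAINS[dataset]))


def summarize(dataset):
    """Return (closest baseline, its gain, furthest baseline, its gain)."""
    gains = gains_by_baseline(dataset)
    closest = min(gains, key=gains.get)
    furthest = max(gains, key=gains.get)
    return closest, gains[closest], furthest, gains[furthest]


for name in IOU_GAINS:
    print(name, summarize(name))
```

Under that ordering assumption, the smallest margin on each dataset belongs to a different baseline (DSATNet on Massachusetts, U-Net on WHU, SegNet on Inria), so no single competitor is consistently the closest.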

## Figures

36 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12190511/full.md

---
Source: https://tomesphere.com/paper/PMC12190511