# Comparing UNet configurations for anthropogenic geomorphic feature extraction from land surface parameters

**Authors:** Sarah Farhadpour, Aaron E. Maxwell, Xiaoyong Sun

PMC · DOI: 10.1371/journal.pone.0325904 · PLOS One · 2025-06-10

## TL;DR

This paper explores different UNet modifications to improve the extraction of human-made land features from high-resolution data, finding that advanced modules help most when training data is limited.

## Contribution

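The study compares a range of architectural modifications to the base UNet on three anthropogenic geomorphic feature extraction tasks (agricultural terraces, mine benches, and valley fill faces) across several training set sizes, and identifies when those modifications improve on the base architecture.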

## Key findings

- Advanced UNet modules improve segmentation performance in limited training data scenarios.
- Base UNet performs adequately with larger training sets (e.g., over 500 image chips).
- Modifications such as attention gates and residual connections appear most beneficial in complex geomorphic landscapes.
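
The training-set-size comparison (50, 100, 250, 500, and the full training set) can be pictured with a small sampling sketch. This is not the authors' procedure: drawing nested subsets from one shuffled list, the `nested_subsets` helper, the seed, and the 1,200-chip pool are all assumptions made for illustration.

```python
import random


def nested_subsets(chip_ids, sizes=(50, 100, 250, 500), seed=42):
    """Return training subsets of increasing size plus the full set.

    Assumption: subsets are nested (each smaller subset is contained in
    every larger one). The paper may instead draw each subset independently.
    """
    rng = random.Random(seed)      # fixed seed so the draw is repeatable
    shuffled = list(chip_ids)
    rng.shuffle(shuffled)
    # Take the first n chips of the shuffled list for each requested size.
    subsets = {n: shuffled[:n] for n in sizes if n <= len(shuffled)}
    subsets["full"] = shuffled
    return subsets


# Hypothetical pool of 1,200 image chip IDs (the real count is not given here).
subsets = nested_subsets(range(1200))
for name, chips in subsets.items():
    print(name, len(chips))
```

Nesting the subsets means a model trained on 100 chips sees everything the 50-chip model saw, which makes differences between sizes easier to attribute to sample count rather than to which chips happened to be drawn.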

## Abstract

The application of deep learning for semantic segmentation has revolutionized image analysis, particularly in the geospatial and medical fields. UNet, an encoder-decoder architecture, has been suggested to be particularly effective. However, limitations such as small sample sizes and class imbalance in anthropogenic geomorphic feature extraction tasks have necessitated the exploration of advanced modifications to improve model performance. This study investigates a variety of architectural modifications to base UNet including replacing the rectified linear unit (ReLU) activation function with leaky ReLU or swish; incorporating residual connections within the encoder blocks, decoder blocks, and bottleneck; inserting squeeze and excitation modules into the encoder or attention gate modules along the skip connections; replacing the default bottleneck layer with one that incorporates dilated convolution; and using a MobileNetV2 architecture as an encoder backbone. Unique geomorphic datasets derived from high spatial resolution lidar data were used to evaluate the performance of these modified UNet architectures on the tasks of mapping agricultural terraces, mine benches, and valley fill faces. The results were further analyzed across varying training sample sizes (50, 100, 250, 500, and the full training set). Our results suggest that the incorporation of advanced modules can enhance segmentation performance, particularly in scenarios involving limited training data or complex geomorphic landscapes. However, differences were minimal when larger training set sizes were used (e.g., above 500 image chips) and the base UNet architecture was generally adequate. This research contributes valuable insights into the optimization of UNet-based models for anthropogenic geomorphic feature extraction and provides a foundation for future work aimed at improving the accuracy and efficiency of deep learning approaches in geospatial applications. 
We argue that one of the positive attributes of UNet is that it can be treated as a general framework that can easily be modified.
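
One of the modifications named above, swapping ReLU for leaky ReLU or swish, is easy to show on scalars. The sketch below uses the standard textbook definitions; the negative slope of 0.01 and `beta = 1.0` are common defaults assumed here, not values taken from the paper.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: zero for negative inputs."""
    return max(0.0, x)


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Like ReLU, but negative inputs keep a small slope (assumed 0.01)."""
    return x if x >= 0 else negative_slope * x


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish: x * sigmoid(beta * x); smooth and non-monotonic near zero."""
    return x * _sigmoid(beta * x)


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  "
          f"leaky={leaky_relu(x):.4f}  swish={swish(x):.4f}")
```

Both alternatives pass a nonzero signal for negative inputs, where ReLU outputs exactly zero. In a framework implementation the same swap would be a one-line change inside each convolutional block, which fits the view of UNet as a general framework that is easy to modify.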

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12151443/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12151443/full.md

## References

97 references — full list in the complete paper: https://tomesphere.com/paper/PMC12151443/full.md

---
Source: https://tomesphere.com/paper/PMC12151443