# SODU2-NET: a novel deep learning-based approach for salient object detection utilizing U-NET

**Authors:** Hyder Abbas, Shen Bing Ren, Muhammad Asim, Syeda Iqra Hassan, Ahmed A. Abd El-Latif

PMC · DOI: 10.7717/peerj-cs.2623 · 2025-05-19

## TL;DR

This paper introduces SODU2-NET, a new deep learning model that improves the detection of salient objects in complex backgrounds using an enhanced U-NET architecture.

## Contribution

The novel SODU2-NET model introduces a densely supervised encoder-decoder with attention and residual blocks for improved salient object detection.

## Key findings

- SODU2-NET outperforms existing models like FCN, Squeeze-net, Deep Lab, and Mask R-CNN in precision, recall, and accuracy.
- The model achieves superior performance on five public datasets and a new real-world dataset called the Changsha dataset.
- The architecture includes an enriched encoder with attention modules and residual blocks to enhance saliency prediction and map refinement.
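
The precision, recall, and accuracy figures cited above are pixel-level measures computed between a predicted saliency map and a binary ground-truth mask. The following is a minimal sketch of how such measures are commonly computed, not the authors' evaluation code. The function names and the fixed binarization threshold of 0.5 are assumptions made for illustration; the summary does not state the paper's exact evaluation protocol.

```python
def confusion_counts(pred, truth, threshold=0.5):
    """Count TP, FP, FN, TN between a saliency map and a binary mask.

    pred  : 2-D list of saliency scores in [0, 1]
    truth : 2-D list of 0/1 labels with the same shape
    threshold : assumed cut-off for calling a pixel salient (hypothetical)
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for score, label in zip(pred_row, truth_row):
            predicted_salient = score >= threshold
            if predicted_salient and label == 1:
                tp += 1
            elif predicted_salient and label == 0:
                fp += 1
            elif not predicted_salient and label == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def precision_recall_accuracy(pred, truth, threshold=0.5):
    """Return (precision, recall, accuracy); empty denominators yield 0.0."""
    tp, fp, fn, tn = confusion_counts(pred, truth, threshold)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy
```

Precision penalizes background pixels marked as salient, recall penalizes missed object pixels, and accuracy counts both classes, which is why the three measures can move independently on cluttered scenes.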

## Abstract

Detecting and segmenting salient objects in natural scenes, often referred to as salient object detection, has attracted great interest in computer vision. Addressing the challenge posed by complex backgrounds is crucial for advancing the field. This article proposes a novel deep learning-based architecture called SODU2-NET (Salient Object Detection U2-Net) for salient object detection that utilizes the U-NET base structure. This model addresses a gap in previous work that focused primarily on complex backgrounds by employing a densely supervised encoder-decoder network. The proposed SODU2-NET employs sophisticated background subtraction techniques and utilizes advanced deep learning architectures that can discern relevant foreground information when dealing with complex backgrounds. First, an enriched encoder block combines full feature fusion (FFF) with atrous spatial pyramid pooling (ASPP) at varying dilation rates to efficiently capture multi-scale contextual information, improving salient object detection in complex backgrounds and reducing the loss of information during down-sampling. Second, the block includes an attention module that refines the decoder and is constructed to enhance the detection of salient objects in complex backgrounds by selectively focusing attention on relevant features. This allows the model to reconstruct detailed and contextually relevant information, which is essential to determining salient objects accurately. Finally, the architecture has been improved by adding a residual block at the encoder end, which is responsible for both saliency prediction and map refinement. The proposed network is designed to learn the transformation between input images and ground truth, enabling accurate segmentation of salient object regions with clear borders and accurate prediction of fine structures.
SODU2-NET is demonstrated to have superior performance on five public datasets (DUTS, SOD, DUT-OMRON, HKU-IS, and PASCAL-S) and on a new real-world dataset, the Changsha dataset. In a comparative assessment against FCN, Squeeze-net, Deep Lab, and Mask R-CNN, the proposed SODU2-NET achieves improvements in precision (6%), recall (5%), and accuracy (3%). Overall, the approach shows promise for improving the accuracy and efficiency of salient object detection in a variety of settings.
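
To illustrate what "varying dilation rates" means in atrous spatial pyramid pooling, the sketch below applies one small kernel to a 1-D signal at several dilation rates and returns one output branch per rate. It is a simplified illustration under stated assumptions, not the SODU2-NET implementation: the function names, the rates `(1, 2, 3)`, and the use of a single shared kernel are hypothetical. In a real ASPP module each branch is a 2-D convolution with its own learned weights, and the abstract does not give the rates the paper uses.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-size 1-D atrous convolution (cross-correlation), zero padded.

    Kernel taps are spaced `rate` samples apart, so a larger rate sees a
    wider context without adding any weights.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


def aspp_1d(signal, kernel, rates=(1, 2, 3)):
    """Return one output branch per dilation rate (hypothetical rates)."""
    return [dilated_conv1d(signal, kernel, rate) for rate in rates]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for rate, branch in zip((1, 2, 3), aspp_1d(impulse, [1, 1, 1])):
        print(rate, receptive_field(3, rate), branch)
```

With a three-tap kernel the span grows from 3 samples at rate 1 to 7 samples at rate 3 while the output keeps the input's length, which is how multi-scale context can be gathered without further down-sampling.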

## Figures

25 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12190711/full.md

---
Source: https://tomesphere.com/paper/PMC12190711