TL;DR
U$^2$-Net introduces a nested U-structure deep network for salient object detection, capturing multi-scale context efficiently and achieving high performance without relying on pre-trained backbones.
Contribution
It proposes a novel two-level nested U-structure built from ReSidual U-blocks (RSU). The blocks capture more multi-scale contextual information and make the network deeper without significantly increasing computational cost.
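The link between pooling inside a block and richer context can be illustrated with standard receptive-field arithmetic. The sketch below is not the paper's RSU block. It only compares a plain stack of 3x3 convolutions with a hypothetical stack that interleaves 2x2 pooling, and the layer counts are assumptions chosen for illustration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel_size, stride) tuples applied in order.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


CONV = (3, 1)  # 3x3 convolution, stride 1
POOL = (2, 2)  # 2x2 pooling, stride 2

# Assumed layouts for illustration only (not taken from the paper).
plain_stack = [CONV] * 4
pooled_stack = [CONV, POOL, CONV, POOL, CONV, POOL, CONV]

rf_plain = receptive_field(plain_stack)
rf_pooled = receptive_field(pooled_stack)
print(rf_plain, rf_pooled)
```

With the same four convolutions, the pooled stack sees a much larger input region, and its intermediate layers see regions of several different sizes, which is the kind of mixture of receptive fields the contribution refers to.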
Findings
Achieves competitive results on six SOD datasets.
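Offers two models for different environments: a full-size U$^2$-Net (176.3 MB, 30 FPS on a GTX 1080Ti GPU) and a small U$^2$-Net$^\dagger$ (4.7 MB, 40 FPS).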
Demonstrates effective deep learning from scratch without pre-trained backbones.
Abstract
In this paper, we design a simple yet powerful deep network architecture, U$^2$-Net, for salient object detection (SOD). The architecture of our U$^2$-Net is a two-level nested U-structure. The design has the following advantages: (1) it is able to capture more contextual information from different scales thanks to the mixture of receptive fields of different sizes in our proposed ReSidual U-blocks (RSU), (2) it increases the depth of the whole architecture without significantly increasing the computational cost because of the pooling operations used in these RSU blocks. This architecture enables us to train a deep network from scratch without using backbones from image classification tasks. We instantiate two models of the proposed architecture, U$^2$-Net (176.3 MB, 30 FPS on GTX 1080Ti GPU) and U$^2$-Net$^\dagger$ (4.7 MB, 40 FPS), to facilitate the usage in different environments.…
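Advantage (2) in the abstract, added depth at low cost because of pooling, can be made concrete by counting multiply-accumulates. This is a rough sketch, not the paper's cost analysis. The input size, channel width, and layer count are hypothetical, and only the downsampling side of a U-shaped block is modelled.

```python
def conv3x3_macs(height, width, in_ch, out_ch):
    """Multiply-accumulates for one 3x3 convolution with 'same' padding."""
    return height * width * in_ch * out_ch * 9


def plain_stack_macs(height, width, channels, num_layers):
    """Cost of num_layers 3x3 convolutions, all at full resolution."""
    return num_layers * conv3x3_macs(height, width, channels, channels)


def pooled_stack_macs(height, width, channels, num_layers):
    """Cost of the same convolutions when 2x2 pooling halves the
    resolution after every layer (downsampling path only)."""
    total = 0
    h, w = height, width
    for _ in range(num_layers):
        total += conv3x3_macs(h, w, channels, channels)
        h, w = max(1, h // 2), max(1, w // 2)
    return total


# Hypothetical sizes chosen for illustration.
plain = plain_stack_macs(64, 64, 16, 4)
pooled = pooled_stack_macs(64, 64, 16, 4)
print(plain, pooled, round(pooled / plain, 3))
```

Each pooling step cuts the cost of the next convolution to a quarter, so under these assumptions four pooled layers cost about a third of four full-resolution layers.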
Taxonomy
Methods: U2-Net
