F3Net: Fusion, Feedback and Focus for Salient Object Detection
Jun Wei, Shuhui Wang, Qingming Huang

TL;DR
F3Net introduces a novel feature fusion and feedback mechanism with a pixel position aware loss to improve salient object detection, achieving superior accuracy and detail preservation over existing methods.
Contribution
The paper proposes a cross feature module (CFM) and a cascaded feedback decoder (CFD), trained with a new pixel position aware (PPA) loss, to address the differences between multi-level features and to preserve local detail in salient object detection.
Findings
Outperforms state-of-the-art methods on five benchmark datasets
Achieves better scores across six evaluation metrics
Effectively preserves local details and boundary accuracy
Abstract
Most existing salient object detection models have achieved great progress by aggregating multi-level features extracted from convolutional neural networks. However, because different convolutional layers have different receptive fields, there are large differences between the features generated by these layers. Common feature fusion strategies (addition or concatenation) ignore these differences and may lead to suboptimal solutions. In this paper, we propose F3Net to solve this problem. It mainly consists of a cross feature module (CFM) and a cascaded feedback decoder (CFD), trained by minimizing a new pixel position aware (PPA) loss. Specifically, CFM aims to selectively aggregate multi-level features. Unlike addition and concatenation, CFM adaptively selects complementary components from the input features before fusion, which can effectively avoid introducing too much…
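The selection-before-fusion idea described in the abstract can be illustrated with a toy sketch. The abstract does not state which operation CFM uses to select components, so the element-wise product below is an assumption made for illustration, and the function names `fuse_by_addition` and `fuse_selectively` are hypothetical. Real features are multi-channel tensors processed by convolutions, not short lists.

```python
# Illustrative sketch only: contrasts plain addition with a
# "select, then fuse" step on two toy feature vectors.
# ASSUMPTION: the element-wise product stands in for the selection step;
# the source text does not specify the operation CFM uses.

def fuse_by_addition(low, high):
    """Baseline fusion: element-wise sum, keeps everything from both inputs."""
    return [l + h for l, h in zip(low, high)]


def fuse_selectively(low, high):
    """Hypothetical selective fusion.

    The element-wise product is non-zero only where both inputs respond,
    so it acts as a gate for shared evidence. The gated signal is then
    added back to each input, giving two refined feature vectors.
    """
    common = [l * h for l, h in zip(low, high)]
    refined_low = [l + c for l, c in zip(low, common)]
    refined_high = [h + c for h, c in zip(high, common)]
    return refined_low, refined_high


# Toy activations: position 1 responds only in the low-level feature,
# position 2 responds in both levels.
low = [0.0, 1.0, 0.5, 0.25]
high = [0.0, 0.0, 1.0, 0.0]

added = fuse_by_addition(low, high)
refined_low, refined_high = fuse_selectively(low, high)

print("addition:      ", added)
print("refined (low): ", refined_low)
print("refined (high):", refined_high)
```

In this toy input, position 1 is active only in the low-level feature, so plain addition carries it into the fused result, while the gated high-level output stays at zero there. Position 2, which both inputs support, is reinforced in both refined outputs.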
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Face Recognition and Perception
