Receptive Field Broadening and Boosting for Salient Object Detection
Mingcan Ma, Changqun Xia, Chenxi Xie, Xiaowu Chen, Jia Li

TL;DR
This paper introduces a novel bilateral transformer-CNN network with Multi-Head Boosting and Attention Feature Fusion for more accurate and efficient salient object detection, addressing the limitations of traditional transformers and multi-branch modules.
Contribution
It proposes a bilateral network combining transformer and CNN, a Multi-Head Boosting strategy, and an Attention Feature Fusion module to improve salient object detection performance.
Findings
Significant performance improvement over state-of-the-art methods.
Effective enhancement of local details and global semantics.
Efficient training with boosted branch selection.
Abstract
Salient object detection requires a comprehensive and scalable receptive field to locate the visually significant objects in the image. Recently, the emergence of visual transformers and multi-branch modules has significantly enhanced the ability of neural networks to perceive objects at different scales. However, compared to the traditional backbone, the calculation process of transformers is time-consuming. Moreover, different branches of the multi-branch modules could cause the same error back propagation in each training iteration, which is not conducive to extracting discriminative features. To solve these problems, we propose a bilateral network based on transformer and CNN to efficiently broaden local details and global semantic information simultaneously. Besides, a Multi-Head Boosting (MHB) strategy is proposed to enhance the specificity of different network branches. By…
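The abstract's remark that the branches of a multi-branch module can receive the same back-propagated error can be illustrated with a toy case. The sketch below is not the paper's method: it assumes the branch outputs are fused by plain summation and trained with a squared-error loss, and the function name `fused_gradients` is hypothetical. Under those assumptions, every branch output receives an identical gradient, which gives the branches no pressure to specialize.

```python
# Illustrative sketch only: a toy multi-branch module with sum fusion.
# Sum fusion and squared-error loss are assumptions, not details from the paper.

def fused_gradients(branch_outputs, target):
    """Return dLoss/d(branch_i output) for each branch of a sum-fused module."""
    fused = sum(branch_outputs)          # fuse the branches by plain summation
    # loss = (fused - target) ** 2, so dLoss/dfused = 2 * (fused - target)
    upstream = 2.0 * (fused - target)
    # dfused/d(branch_i) = 1 for every branch, so each gets the same signal
    return [upstream * 1.0 for _ in branch_outputs]

grads = fused_gradients([0.2, 0.5, 0.9], target=1.0)
print(grads)  # one gradient per branch; all entries are equal
```

A strategy that aims to make branches more specific, as MHB is described as doing, would need to break this symmetry in some way; how the paper does so is not covered by the truncated abstract.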
Taxonomy
Topics: Visual Attention and Saliency Detection · Olfactory and Sensory Function Studies · Advanced Image and Video Retrieval Techniques
