Deep feature selection-and-fusion for RGB-D semantic segmentation
Yuejiao Su, Yuan Yuan, Zhiyu Jiang

TL;DR
This paper introduces FSFNet, a novel deep learning model that explicitly fuses multi-modality RGB-D data for improved semantic segmentation, addressing the loss of critical features in deep networks.
Contribution
The work presents a unified feature selection and fusion network with explicit multi-modality fusion and detailed feature propagation, enhancing segmentation accuracy over existing methods.
Findings
Achieves competitive performance on public datasets.
Outperforms state-of-the-art methods in accuracy.
Effectively maintains low-level details during segmentation.
Abstract
Scene depth information can help visual information for more accurate semantic segmentation. However, how to effectively integrate multi-modality information into representative features is still an open problem. Most of the existing work uses DCNNs to implicitly fuse multi-modality information. But as the network deepens, some critical distinguishing features may be lost, which reduces the segmentation performance. This work proposes a unified and efficient feature selection-and-fusion network (FSFNet), which contains a symmetric cross-modality residual fusion module used for explicit fusion of multi-modality information. Besides, the network includes a detailed feature propagation module, which is used to maintain low-level detailed information during the forward process of the network. Compared with the state-of-the-art methods, experimental evaluations demonstrate that the proposed…
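To make the idea of explicit, symmetric cross-modality fusion concrete, here is a minimal NumPy sketch. It is not the FSFNet module itself: the abstract does not give the layer-level design, so the channel gate, the weight matrices `w_rgb` and `w_depth`, and the function names are all hypothetical choices made for illustration. The sketch only shows the general pattern in which each modality keeps its own features (the residual path) and receives a selected subset of the other modality's features.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_gate(feat, weight):
    """Hypothetical feature selection: per-channel weights in (0, 1).

    feat:   feature map of shape (C, H, W)
    weight: (C, C) matrix applied to the globally average-pooled features
    """
    pooled = feat.mean(axis=(1, 2))      # global average pool -> (C,)
    return sigmoid(weight @ pooled)      # selection weights   -> (C,)


def symmetric_residual_fusion(rgb, depth, w_rgb, w_depth):
    """Illustrative symmetric cross-modality residual fusion (an assumption,
    not the paper's exact module)."""
    # Select informative channels from each modality.
    sel_rgb = channel_gate(rgb, w_rgb)[:, None, None] * rgb
    sel_depth = channel_gate(depth, w_depth)[:, None, None] * depth
    # Residual exchange: each stream keeps its input and adds the other
    # stream's selected features, so the two branches mirror each other.
    rgb_out = rgb + sel_depth
    depth_out = depth + sel_rgb
    # One simple way to form a fused representation for a decoder.
    fused = rgb_out + depth_out
    return rgb_out, depth_out, fused


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c, h, w = 8, 16, 16
    rgb_feat = rng.standard_normal((c, h, w))
    depth_feat = rng.standard_normal((c, h, w))
    w_r = rng.standard_normal((c, c)) * 0.1
    w_d = rng.standard_normal((c, c)) * 0.1
    r, d, f = symmetric_residual_fusion(rgb_feat, depth_feat, w_r, w_d)
    print(r.shape, d.shape, f.shape)
```

Because the two branches use the same operation with the roles of RGB and depth swapped, exchanging the inputs (and their weights) exchanges the two stream outputs and leaves the fused map unchanged. In a real network the gate weights would be learned, and the fusion would typically be applied at several encoder stages.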
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging
