Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation
Xiaokang Chen, Kwan-Yee Lin, Jingbo Wang, Wayne Wu, Chen Qian, Hongsheng Li, Gang Zeng

TL;DR
This paper introduces a novel cross-modality encoder with separation-and-aggregation gating for RGB-D semantic segmentation, effectively handling noisy depth data and improving feature fusion between RGB and depth modalities.
Contribution
The work proposes a unified encoder with a separation-and-aggregation gating mechanism and bi-directional multi-step propagation to enhance RGB-D feature fusion and segmentation accuracy.
Findings
Outperforms state-of-the-art methods on challenging datasets
Effectively handles noisy and misaligned depth data
Improves feature calibration and fusion in RGB-D segmentation
Abstract
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images for providing a geometric counterpart to the RGB representation. Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as a cross-modal feature fusion to obtain better feature representations to achieve more accurate segmentation. This, however, may not lead to satisfactory results as actual depth data are generally noisy, which might worsen the accuracy as the networks go deeper. In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately. The key of the proposed architecture is a novel…
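The recalibrate-then-aggregate idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (channel_gate, aggregate, fuse), the scalar weight and bias, and the use of raw feature values as gate scores are all assumptions made for illustration. A real encoder would learn these gates with convolutional or fully connected layers. Features are represented as a list of channels, each a list of spatial values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(primary, other, weight=1.0, bias=0.0):
    """Scale each channel of `primary` by a sigmoid gate computed from the
    global means of both modalities (hypothetical stand-in for a learned gate)."""
    gated = []
    for p_ch, o_ch in zip(primary, other):
        # Global descriptor: average of the two per-channel spatial means.
        descriptor = (sum(p_ch) / len(p_ch) + sum(o_ch) / len(o_ch)) / 2.0
        g = sigmoid(weight * descriptor + bias)
        gated.append([g * v for v in p_ch])
    return gated


def aggregate(rgb, depth):
    """Blend the two modalities per position with a two-way softmax gate.
    Assumption: the feature values themselves serve as gate scores."""
    fused = []
    for r_ch, d_ch in zip(rgb, depth):
        out = []
        for r, d in zip(r_ch, d_ch):
            m = max(r, d)  # subtract the max for numerical stability
            e_r, e_d = math.exp(r - m), math.exp(d - m)
            g_r = e_r / (e_r + e_d)
            out.append(g_r * r + (1.0 - g_r) * d)
        fused.append(out)
    return fused


def fuse(rgb, depth):
    """Filter each modality with a cross-modal gate, add the filtered signal
    to the other modality, then aggregate the two recalibrated features."""
    depth_filtered = channel_gate(depth, rgb)
    rgb_filtered = channel_gate(rgb, depth)
    rgb_recal = [[r + d for r, d in zip(r_ch, d_ch)]
                 for r_ch, d_ch in zip(rgb, depth_filtered)]
    depth_recal = [[d + r for d, r in zip(d_ch, r_ch)]
                   for d_ch, r_ch in zip(depth, rgb_filtered)]
    return aggregate(rgb_recal, depth_recal)


if __name__ == "__main__":
    rgb = [[0.2, 0.8], [1.0, 0.5]]     # 2 channels, 2 positions
    depth = [[0.1, 0.9], [0.0, 0.0]]   # second channel: missing depth
    print(fuse(rgb, depth))
```

Because the aggregation weights sum to one at every position, each fused value is a convex combination of the two recalibrated inputs, so a modality with a weak response at a position contributes less there.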
Taxonomy
Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Image Processing Techniques and Applications
