RedNet: Residual Encoder-Decoder Network for indoor RGB-D Semantic Segmentation

Jindong Jiang, Lunan Zheng, Fei Luo, and Zhijun Zhang

arXiv:1806.01054·cs.CV·August 7, 2018·181 cites

TL;DR

RedNet is a residual encoder-decoder network that fuses RGB and depth features for indoor semantic segmentation, reaching 47.8% mIoU on the SUN RGB-D dataset.

Contribution

The paper introduces RedNet, a residual encoder-decoder architecture with a fusion structure and pyramid supervision for improved indoor RGB-D semantic segmentation.

Findings

01. Achieves 47.8% mIoU on the SUN RGB-D dataset.

02. Uses residual modules as the basic building block in both the encoder and the decoder.

03. Applies pyramid supervision, which supervises several decoder layers during training to counter vanishing gradients.
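
As a rough illustration of the pyramid supervision idea, the sketch below sums a cross-entropy loss over decoder outputs at several resolutions, each compared against a resized copy of the label map. It is a minimal pure-Python sketch, not the authors' implementation: the function names, the nearest-neighbour label resizing, the choice of scales, and the equal weighting of the per-scale losses are all assumptions made for illustration.

```python
import math


def downsample_nearest(label_map, factor):
    """Nearest-neighbour downsampling of a 2D label map by an integer factor."""
    return [row[::factor] for row in label_map[::factor]]


def cross_entropy(logits, labels):
    """Mean per-pixel cross-entropy; logits[y][x] is a list of class scores."""
    total, count = 0.0, 0
    for logit_row, label_row in zip(logits, labels):
        for scores, target in zip(logit_row, label_row):
            peak = max(scores)  # subtract the max for numerical stability
            log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
            total += log_sum - scores[target]
            count += 1
    return total / count


def pyramid_loss(outputs_by_factor, label_map):
    """Sum the loss of every supervised decoder output (hypothetical helper).

    outputs_by_factor maps a downsampling factor (1 = full resolution) to the
    logits predicted at that decoder stage. Equal weights are an assumption.
    """
    return sum(
        cross_entropy(logits, downsample_nearest(label_map, factor))
        for factor, logits in outputs_by_factor.items()
    )


# Toy usage: a 4x4 label map supervised at full and half resolution.
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [1, 1, 0, 0],
          [1, 1, 0, 0]]


def uniform_logits(height, width, classes=2):
    """Logits that score every class equally (an untrained predictor)."""
    return [[[0.0] * classes for _ in range(width)] for _ in range(height)]


outputs = {1: uniform_logits(4, 4), 2: uniform_logits(2, 2)}
loss = pyramid_loss(outputs, labels)
```

Because every supervised stage contributes its own loss term, gradients reach the deeper decoder layers directly instead of only through the final output, which is the motivation the abstract gives for the scheme.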

Abstract

Indoor semantic segmentation has always been a difficult task in computer vision. In this paper, we propose an RGB-D residual encoder-decoder architecture, named RedNet, for indoor RGB-D semantic segmentation. In RedNet, the residual module is applied to both the encoder and decoder as the basic building block, and skip connections pass spatial features from the encoder to the decoder. To incorporate the depth information of the scene, a fusion structure is constructed, which runs inference on the RGB image and the depth image separately and fuses their features over several layers. To optimize the network's parameters efficiently, we propose a 'pyramid supervision' training scheme, which applies supervised learning over different layers in the decoder to cope with the problem of vanishing gradients. Experimental results show that the proposed…
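
The fusion structure described in the abstract can be pictured as two encoder branches that run side by side and merge at several stages. The following is a minimal sketch under stated assumptions: features are plain lists of floats, the merge is an element-wise sum, and the fused result feeds the next RGB stage. None of these choices is confirmed by this page, and `fused_encoder`, `add_features`, and the toy stages are hypothetical names.

```python
def add_features(rgb_feat, depth_feat):
    """Element-wise sum of two equally sized feature vectors (assumed merge)."""
    return [r + d for r, d in zip(rgb_feat, depth_feat)]


def fused_encoder(rgb, depth, rgb_stages, depth_stages):
    """Run both branches stage by stage, merging depth into RGB after each."""
    skips = []
    for rgb_stage, depth_stage in zip(rgb_stages, depth_stages):
        depth = depth_stage(depth)  # the depth branch only sees depth features
        rgb = add_features(rgb_stage(rgb), depth)  # fused features feed the next RGB stage
        skips.append(rgb)  # kept so a decoder could use them as skip connections
    return rgb, skips


# Toy stages standing in for residual layers (purely illustrative).
def double(features):
    return [2 * value for value in features]


def halve(features):
    return [value / 2 for value in features]


fused, skips = fused_encoder(
    rgb=[1.0, 2.0],
    depth=[10.0, 20.0],
    rgb_stages=[double, double],
    depth_stages=[halve, halve],
)
```

Collecting the fused features at each stage mirrors the role of the skip connections mentioned in the abstract: the decoder can reuse spatial detail from matching encoder stages.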

Taxonomy

Topics: Remote Sensing and LiDAR Applications · Advanced Neural Network Applications · Video Surveillance and Tracking Methods