Multi-View Deep Learning for Consistent Semantic Mapping with RGB-D Cameras

Lingni Ma; Jörg Stückler; Christian Kerl; Daniel Cremers

arXiv:1703.08866 · cs.CV · December 6, 2017 · 19 cites

Open Access

TL;DR

This paper introduces a multi-view deep learning approach for consistent semantic segmentation using RGB-D cameras, improving accuracy by fusing information from multiple viewpoints during training and testing.

Contribution

It presents a novel multi-view training framework that enhances semantic segmentation consistency and accuracy over single-view methods using RGB-D data.

Findings

01. Multi-view training improves segmentation accuracy.

02. Fusion of multiple views outperforms single-view baselines.

03. Achieves state-of-the-art results on the NYUDv2 dataset.

Abstract

Visual scene understanding is an important capability that enables robots to purposefully act in their environment. In this paper, we propose a novel approach to object-class segmentation from multiple RGB-D views using deep learning. We train a deep neural network to predict object-class semantics that is consistent from several view points in a semi-supervised way. At test time, the semantics predictions of our network can be fused more consistently in semantic keyframe maps than predictions of a network trained on individual views. We base our network architecture on a recent single-view deep learning approach to RGB and depth fusion for semantic object-class segmentation and enhance it with multi-scale loss minimization. We obtain the camera trajectory using RGB-D SLAM and warp the predictions of RGB-D images into ground-truth annotated frames in order to enforce multi-view…
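
The fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the per-view class probabilities have already been warped into a common keyframe (for example, using depth and the SLAM camera poses) and that a validity mask marks pixels where the warp succeeded. The abstract does not state the fusion rule, so the product-of-probabilities rule used here is an assumption, and the function name `fuse_warped_predictions` and the array layout are hypothetical.

```python
import numpy as np


def fuse_warped_predictions(probs, valid, eps=1e-8):
    """Fuse per-view class probabilities that were warped into one keyframe.

    probs: array of shape (V, H, W, C), class probabilities for V views.
    valid: boolean array of shape (V, H, W), True where the warp hit the pixel.
    Returns (labels, fused): labels is (H, W), fused is (H, W, C).

    Assumption: views are treated as independent, so probabilities are
    multiplied (summed in log space) and then renormalized per pixel.
    """
    probs = np.asarray(probs, dtype=np.float64)
    valid = np.asarray(valid, dtype=bool)

    # Work in log space for numerical stability; clip to avoid log(0).
    log_p = np.log(np.clip(probs, eps, 1.0))

    # Invalid observations contribute nothing (log-probability of 0).
    log_p = np.where(valid[..., None], log_p, 0.0)

    # Sum over views, then renormalize over classes.
    fused_log = log_p.sum(axis=0)
    fused_log -= fused_log.max(axis=-1, keepdims=True)
    fused = np.exp(fused_log)
    fused /= fused.sum(axis=-1, keepdims=True)

    # Pixels with no valid view end up uniform, so argmax returns class 0 there.
    labels = fused.argmax(axis=-1)
    return labels, fused


# Usage: two views of a single pixel with three classes.
probs = np.array([[[[0.5, 0.4, 0.1]]], [[[0.2, 0.7, 0.1]]]])
valid = np.ones((2, 1, 1), dtype=bool)
labels, fused = fuse_warped_predictions(probs, valid)
```

In this toy case the first view alone favors class 0, while the fused estimate favors class 1, which shows how a second viewpoint can change the final label. Other fusion rules, such as averaging or per-class maximum, would slot into the same interface.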

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Vision and Imaging