Semantic Pose using Deep Networks Trained on Synthetic RGB-D

Jeremie Papon; Markus Schoeler

arXiv:1508.00835·cs.CV·August 5, 2015·25 cites

Open Access

TL;DR

This paper presents a deep neural network approach to indoor scene understanding from RGB-D images. The network recognizes instances of common furniture classes and estimates their pose and location efficiently, even in cluttered, noisy environments.

Contribution

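The authors introduce a multi-output CNN that predicts the class, pose, and location of furniture instances simultaneously. It is trained on synthetic scenes rendered on the fly, then adapted to real data by transfer learning on the small amount of publicly available annotated RGB-D data.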
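A multi-output network of this kind is typically trained with one loss term per output, summed into a single objective. The sketch below illustrates that idea only; the page does not give the paper's actual loss, so the function name `multi_output_loss`, the (cos, sin) pose encoding, the box format, and the weights `w_pose` and `w_box` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_squared_error(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def multi_output_loss(class_logits, pose_pred, box_pred,
                      class_target, pose_target, box_target,
                      w_pose=1.0, w_box=1.0):
    """Hypothetical combined loss for class, pose, and location outputs.

    Assumptions (not taken from the paper): pose is encoded as
    (cos, sin) of a yaw angle, location as a 4-value box, and the
    three terms are combined as a weighted sum.
    """
    class_loss = -math.log(softmax(class_logits)[class_target])
    pose_loss = mean_squared_error(pose_pred, pose_target)
    box_loss = mean_squared_error(box_pred, box_target)
    return class_loss + w_pose * pose_loss + w_box * box_loss


# Example: three classes with uniform scores, perfect pose and box.
loss = multi_output_loss(
    class_logits=[0.0, 0.0, 0.0],
    pose_pred=[1.0, 0.0], box_pred=[0.0, 0.0, 1.0, 1.0],
    class_target=0,
    pose_target=[1.0, 0.0], box_target=[0.0, 0.0, 1.0, 1.0],
)
```

With perfect pose and box predictions, only the classification term contributes, so the value here is the cross-entropy of a uniform three-way guess.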

Findings

01 Successfully annotates challenging real scenes

02 Operates in real-time on GPU

03 Performs well with limited and noisy data

Abstract

In this work we address the problem of indoor scene understanding from RGB-D images. Specifically, we propose to find instances of common furniture classes, their spatial extent, and their pose with respect to generalized class models. To accomplish this, we use a deep, wide, multi-output convolutional neural network (CNN) that predicts class, pose, and location of possible objects simultaneously. To overcome the lack of large annotated RGB-D training sets (especially those with pose), we use an on-the-fly rendering pipeline that generates realistic cluttered room scenes in parallel to training. We then perform transfer learning on the relatively small amount of publicly available annotated RGB-D data, and find that our model is able to successfully annotate even highly challenging real scenes. Importantly, our trained network is able to understand noisy and sparse observations of…
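Generating training data "in parallel to training" can be pictured as a producer-consumer arrangement: one thread renders scenes into a bounded buffer while the training loop consumes them. The sketch below shows that pattern with the Python standard library only. It is not the authors' pipeline; `render_synthetic_scene` is a hypothetical stand-in that returns random numbers instead of a rendered RGB-D image, and the buffer size and batch size are arbitrary.

```python
import queue
import random
import threading


def render_synthetic_scene(rng):
    """Hypothetical stand-in for a renderer: returns (pixels, label)."""
    label = rng.randrange(3)                    # fake class id
    pixels = [rng.random() for _ in range(16)]  # fake depth values
    return pixels, label


def producer(buffer, n_samples, seed):
    """Render samples into the buffer, then signal completion."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        buffer.put(render_synthetic_scene(rng))  # blocks while the buffer is full
    buffer.put(None)                             # sentinel: no more samples


def train_on_stream(n_samples=32, batch_size=8, seed=0):
    """Consume rendered samples in batches while rendering continues."""
    buffer = queue.Queue(maxsize=16)  # bounded so rendering cannot run far ahead
    worker = threading.Thread(
        target=producer, args=(buffer, n_samples, seed), daemon=True
    )
    worker.start()

    batches_seen = 0
    batch = []
    while True:
        sample = buffer.get()
        if sample is None:
            break
        batch.append(sample)
        if len(batch) == batch_size:
            # A real training step on `batch` would go here.
            batches_seen += 1
            batch = []

    worker.join()
    return batches_seen  # incomplete trailing batches are dropped


full_batches = train_on_stream(n_samples=32, batch_size=8)
```

The bounded queue is the main design choice: it keeps memory use fixed and lets the slower side, rendering or training, set the pace.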

Taxonomy

Topics: Advanced Vision and Imaging · Video Surveillance and Tracking Methods · Robotics and Sensor-Based Localization