Deep Learning of Local RGB-D Patches for 3D Object Detection and 6D Pose Estimation

Wadim Kehl; Fausto Milletari; Federico Tombari; Slobodan Ilic; Nassir Navab

arXiv:1607.06038·cs.CV·July 21, 2016·30 cites

Open Access

TL;DR

This paper introduces a 3D object detection method using deep learning to regress descriptors of local RGB-D patches, enabling accurate 6D pose estimation and robust detection across diverse datasets.

Contribution

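The paper proposes regressing descriptors of local RGB-D patches with a convolutional auto-encoder trained on a large collection of random local patches. Scene patch descriptors are matched against a database of synthetic model view patches to cast 6D object votes, and the approach scales in the number of objects.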

Findings

01

Delivers robust detection results that compete with and surpass the state of the art

02

Generalizes well to previously unseen input data across three evaluation datasets

03

Provides scalable detection for multiple objects

Abstract

We present a 3D object detection method that uses regressed descriptors of locally sampled RGB-D patches for 6D vote casting. For regression, we employ a convolutional auto-encoder that has been trained on a large collection of random local patches. During testing, scene patch descriptors are matched against a database of synthetic model view patches and cast 6D object votes, which are subsequently filtered to refined hypotheses. We evaluate on three datasets to show that our method generalizes well to previously unseen input data and delivers robust detection results that compete with and surpass the state of the art, while being scalable in the number of objects.
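The match-then-vote step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes descriptors have already been produced by some encoder, uses brute-force Euclidean nearest-neighbor search, votes only for the object's 3D position (the paper casts full 6D votes that include rotation), and filters votes by simple grid binning. All names (`nearest_neighbors`, `cast_votes`, `filter_votes`, `offset_to_center`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def nearest_neighbors(query, database, k=1):
    """Return the k database entries closest to `query` (Euclidean distance)."""
    return sorted(database, key=lambda e: math.dist(query, e["descriptor"]))[:k]


def cast_votes(scene_patches, database, k=1, max_dist=0.5):
    """Let every scene patch vote for an object center via its matched model patches.

    Assumption: each database entry stores the offset from its patch to the
    object center, already expressed in the scene frame (rotation is ignored).
    """
    votes = []
    for patch in scene_patches:
        for entry in nearest_neighbors(patch["descriptor"], database, k):
            # Discard weak matches so that clutter does not vote.
            if math.dist(patch["descriptor"], entry["descriptor"]) > max_dist:
                continue
            center = tuple(
                p + o for p, o in zip(patch["position"], entry["offset_to_center"])
            )
            votes.append((entry["object_id"], center))
    return votes


def filter_votes(votes, cell=0.05, min_votes=2):
    """Bin votes on a regular grid and keep the bins with enough support."""
    bins = Counter()
    for object_id, center in votes:
        key = (object_id,) + tuple(math.floor(c / cell) for c in center)
        bins[key] += 1
    return [(key, n) for key, n in bins.most_common() if n >= min_votes]


# Toy database of "model view patches" (2-D descriptors for readability).
database = [
    {"descriptor": [0.0, 0.0], "object_id": "cup", "offset_to_center": (0.10, 0.0, 0.0)},
    {"descriptor": [1.0, 1.0], "object_id": "cup", "offset_to_center": (-0.10, 0.0, 0.0)},
    {"descriptor": [5.0, 5.0], "object_id": "box", "offset_to_center": (0.0, 0.2, 0.0)},
]

# Two scene patches on opposite sides of the same object.
scene_patches = [
    {"descriptor": [0.05, 0.0], "position": (0.41, 0.21, 1.01)},
    {"descriptor": [1.0, 0.95], "position": (0.61, 0.21, 1.01)},
]

hypotheses = filter_votes(cast_votes(scene_patches, database))
print(hypotheses)
```

Both scene patches vote for the same grid cell, so a single "cup" hypothesis with two supporting votes survives the filter. Because each patch votes independently, adding objects only enlarges the database being searched, which is one way to read the scalability claim.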

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques