RGB-D Object Detection and Semantic Segmentation for Autonomous Manipulation in Clutter

Max Schwarz; Anton Milan; Arul Selvam Periyasamy; Sven Behnke

arXiv:1810.00818·cs.CV·October 3, 2018

TL;DR

This paper presents a deep-learning approach that combines object detection and semantic segmentation on RGB-D data for autonomous manipulation in clutter. It is evaluated on two data sets: one captured for the Amazon Picking Challenge 2016 and one captured in disaster-response scenarios.

Contribution

The paper introduces a novel RGB-D perception method that fuses depth information and leverages pretrained features to enhance object detection and segmentation in cluttered scenes.

Findings

01 Achieved high accuracy in object detection and segmentation in cluttered scenes.

02 Demonstrated robustness on Amazon Picking Challenge and disaster-response datasets.

03 Combined perception methods improve reliability of robotic manipulation.

Abstract

Autonomous robotic manipulation in clutter is challenging. A large variety of objects must be perceived in complex scenes, where they are partially occluded and embedded among many distractors, often in restricted spaces. To tackle these challenges, we developed a deep-learning approach that combines object detection and semantic segmentation. The manipulation scenes are captured with RGB-D cameras, for which we developed a depth fusion method. Employing pretrained features makes learning from small annotated robotic data sets possible. We evaluate our approach on two challenging data sets: one captured for the Amazon Picking Challenge 2016, where our team NimbRo came in second in the Stowing and third in the Picking task, and one captured in disaster-response scenarios. The experiments show that object detection and semantic segmentation complement each other and can be combined to…
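
The abstract says that object detection and semantic segmentation complement each other and can be combined, but the text on this page does not say how. The following is a minimal sketch of one possible combination, not the authors' method: per-pixel class probabilities from a segmentation model are multiplied by a prior derived from detection boxes, and the highest-scoring class is kept. The function name `fuse_labels`, the `(class_id, score, x0, y0, x1, y1)` box layout, and the `floor` parameter are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical detection format: (class_id, score, x0, y0, x1, y1).
# The box covers pixels with x0 <= x < x1 and y0 <= y < y1.
Detection = Tuple[int, float, int, int, int, int]


def fuse_labels(
    seg_probs: List[List[List[float]]],
    detections: List[Detection],
    floor: float = 0.05,
) -> List[List[int]]:
    """Return a per-pixel label map from segmentation scores and boxes.

    seg_probs[y][x][c] is the segmentation probability of class c at (x, y).
    Each class gets a detection prior of `floor` everywhere, raised to the
    box score wherever a box of that class covers the pixel. The prior is
    multiplied with the segmentation probability and the argmax is taken.
    """
    height = len(seg_probs)
    width = len(seg_probs[0])
    num_classes = len(seg_probs[0][0])

    labels: List[List[int]] = []
    for y in range(height):
        row: List[int] = []
        for x in range(width):
            # Start every class at the floor so that classes without a
            # covering box are down-weighted rather than ruled out.
            prior = [floor] * num_classes
            for cls, score, x0, y0, x1, y1 in detections:
                if x0 <= x < x1 and y0 <= y < y1:
                    prior[cls] = max(prior[cls], score)
            fused = [p * q for p, q in zip(seg_probs[y][x], prior)]
            # On ties, the lowest class index wins.
            row.append(max(range(num_classes), key=fused.__getitem__))
        labels.append(row)
    return labels


# Two pixels, two classes. Segmentation alone prefers class 0 at both
# pixels, but a confident class-1 box covers the second pixel.
example_probs = [[[0.6, 0.4], [0.6, 0.4]]]
example_boxes = [(1, 0.9, 1, 0, 2, 1)]
print(fuse_labels(example_probs, example_boxes))  # [[0, 1]]
```

In this toy case the detection overrides a weak segmentation preference inside the box only, which is the kind of complementary behaviour the abstract alludes to. A real system would operate on dense arrays and would need a principled treatment of the background class.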
