Object segmentation in depth maps with one user click and a synthetically trained fully convolutional network

Matthieu Grard; Romain Brégier; Florian Sella; Emmanuel Dellandréa; Liming Chen

arXiv:1801.01281 · cs.CV · September 25, 2018 · 1 citation

Open Access

TL;DR

This paper presents a fully convolutional network, trained on synthetic data, for interactive object segmentation in depth maps. It requires only one user click per object and is intended to aid robotic grasping of objects piled in bulk.

Contribution

It introduces a training approach based on edge-mask duality and shows that a network trained on synthetic data can segment objects effectively, reducing the need for manual per-pixel labeling.
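As a toy illustration of what a duality between edges and masks can mean, the sketch below converts a binary mask into its boundary pixels and then recovers the mask from those boundary pixels plus a single click. It is not the paper's network or training procedure: the function names `mask_to_edges` and `edges_to_mask`, the flood-fill recovery, and the rectangle example are all assumptions made for illustration, and the round trip is only exact for simple shapes that are at least three pixels thick.

```python
from collections import deque

import numpy as np


def mask_to_edges(mask):
    """Return the boundary pixels of a binary mask (hypothetical helper).

    A mask pixel is a boundary pixel if at least one of its four
    neighbours lies outside the mask (or outside the image).
    """
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    # A location is "interior" when its up, down, left and right neighbours are all in the mask.
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    return mask & ~interior


def edges_to_mask(edges, click):
    """Recover a mask from an edge map and one click (row, col) (hypothetical helper).

    Flood-fills from the click over non-edge pixels (4-connectivity), then adds
    the edge pixels that touch the filled region (8-neighbourhood).
    Returns an empty mask if the click lands on an edge pixel.
    """
    edges = edges.astype(bool)
    h, w = edges.shape
    filled = np.zeros((h, w), dtype=bool)
    if edges[click]:
        return filled
    filled[click] = True
    queue = deque([click])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not filled[nr, nc] and not edges[nr, nc]:
                filled[nr, nc] = True
                queue.append((nr, nc))
    # Grow the filled region by one pixel in all eight directions.
    padded = np.pad(filled, 1, constant_values=False)
    grown = np.zeros_like(filled)
    for dr in range(3):
        for dc in range(3):
            grown |= padded[dr:dr + h, dc:dc + w]
    return filled | (grown & edges)


# Usage: a rectangular "object" in an otherwise empty image.
mask = np.zeros((8, 10), dtype=bool)
mask[2:6, 3:8] = True
edges = mask_to_edges(mask)
recovered = edges_to_mask(edges, (3, 5))  # one click inside the object
print(np.array_equal(recovered, mask))
```

In this sketch, a click on the background would instead return the background region plus the edge pixels bordering it, which is one reason a learned model is needed for real, noisy depth maps.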

Findings

01. Edge-mask duality improves segmentation accuracy.

02. Synthetic training data suffices for effective real-world application.

03. Outperforms patch-based networks in object segmentation tasks.

Abstract

With more and more household objects built on planned obsolescence and consumed by a fast-growing population, hazardous waste recycling has become a critical challenge. Given the large variability of household waste, current recycling platforms mostly rely on human operators to analyze the scene, typically composed of many object instances piled up in bulk. Helping them by robotizing the unitary extraction is a key challenge to speed up this tedious process. Although supervised deep learning has proven very effective for such object-level scene understanding, e.g., generic object detection and segmentation in everyday scenes, it requires large sets of per-pixel labeled images, which are rarely available in many application contexts, including industrial robotics. We thus propose a step towards a practical interactive application for generating an object-oriented robotic…

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings