Object segmentation in depth maps with one user click and a synthetically trained fully convolutional network
Matthieu Grard, Romain Brégier, Florian Sella, Emmanuel Dellandréa, Liming Chen

TL;DR
This paper presents a fully convolutional network, trained on synthetic data, for interactive object segmentation in depth maps. It requires only one user click and is intended to aid robotic grasping of objects piled up in bulk.
Contribution
It introduces a novel edge-mask duality training approach and demonstrates effective segmentation with synthetic data, reducing the need for manual labeling.
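One way to read the edge-mask duality: a closed contour and the region it encloses carry the same information, so an edge map plus a single click is enough to recover an object mask. The sketch below illustrates that idea with a plain flood fill. It assumes the network output has already been thresholded into a binary edge map; the function name `mask_from_click` and the toy `edges` grid are hypothetical, and this is not necessarily the procedure used in the paper.

```python
from collections import deque

def mask_from_click(edges, click):
    """Grow a binary mask from `click` until edge pixels are reached.

    edges: list of rows of 0/1 values (1 = edge pixel); assumed input format.
    click: (row, col) of the user click.
    """
    rows, cols = len(edges), len(edges[0])
    r0, c0 = click
    mask = [[0] * cols for _ in range(rows)]
    if edges[r0][c0]:
        return mask  # click landed on an edge pixel: nothing to grow
    mask[r0][c0] = 1
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        # 4-connected neighbours, so diagonal gaps in the contour do not leak
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not edges[nr][nc] and not mask[nr][nc]:
                mask[nr][nc] = 1
                queue.append((nr, nc))
    return mask

# Toy edge map: a contour isolates the top-left 2x2 block.
edges = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
top_left = mask_from_click(edges, (0, 0))
```

Clicking at a different location, for example `(3, 4)`, selects the other region instead, which is what makes a single click sufficient to disambiguate the object of interest.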
Findings
Edge-mask duality improves segmentation accuracy.
Synthetic training data suffices for effective real-world application.
The fully convolutional network outperforms patch-based networks in object segmentation tasks.
Abstract
With more and more household objects built on planned obsolescence and consumed by a fast-growing population, hazardous waste recycling has become a critical challenge. Given the large variability of household waste, current recycling platforms mostly rely on human operators to analyze the scene, typically composed of many object instances piled up in bulk. Helping them by robotizing the unitary extraction is a key challenge to speed up this tedious process. Whereas supervised deep learning has proven very efficient for such object-level scene understanding, e.g., generic object detection and segmentation in everyday scenes, it however requires large sets of per-pixel labeled images, that are hardly available for numerous application contexts, including industrial robotics. We thus propose a step towards a practical interactive application for generating an object-oriented robotic…
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
