Multi-Task Deep Networks for Depth-Based 6D Object Pose and Joint Registration in Crowd Scenarios
Juil Sock, Kwang In Kim, Caner Sahin, Tae-Kyun Kim

TL;DR
This paper introduces a multi-task deep learning approach for accurately estimating 6D object poses and performing joint registration in cluttered bin-picking scenarios with occlusion and similar distractors.
Contribution
It presents a novel multi-task neural network architecture that jointly learns 2D detection, depth estimation, and 3D pose estimation for individual objects, together with joint registration of multiple objects, so that occlusion and clutter are handled effectively.
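As a rough illustration of what joint learning over several sub-tasks can look like, the sketch below combines per-task losses into one training objective with a weighted sum. This is a common generic multi-task formulation, not necessarily the loss used in the paper; the sub-task names are taken from the summary, while the function name `combine_losses` and all numeric values are hypothetical.

```python
from typing import Dict

# Sub-task names follow the summary; they are labels for this sketch only.
SUBTASKS = ("detection", "depth", "pose", "registration")


def combine_losses(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of per-task losses (assumed formulation)."""
    missing = [t for t in SUBTASKS if t not in losses or t not in weights]
    if missing:
        raise KeyError(f"missing sub-tasks: {missing}")
    return sum(weights[t] * losses[t] for t in SUBTASKS)


# Illustrative values only; these are not results from the paper.
example_losses = {"detection": 0.5, "depth": 0.25, "pose": 1.0, "registration": 2.0}
example_weights = {"detection": 1.0, "depth": 2.0, "pose": 1.0, "registration": 0.5}
total = combine_losses(example_losses, example_weights)
```

In a setup like this, the weights control how strongly each sub-task influences the shared network parameters during training.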
Findings
Outperforms state-of-the-art methods by 15-31% in average precision.
Effectively handles occlusion and clutter in bin-picking scenarios.
Demonstrates robustness on both synthetic and real datasets.
Abstract
In bin-picking scenarios, multiple instances of an object of interest are stacked randomly in a pile, and hence the instances are inherently subject to several challenges: severe occlusion, clutter, and similar-looking distractors. Most existing methods, however, target single isolated object instances, while some recent methods tackle crowd scenarios through post-refinement that accounts for relations among multiple objects. In this paper, we address recovering the 6D poses of multiple instances in bin-picking scenarios in the depth modality by multi-task learning in deep neural networks. Our architecture jointly learns multiple sub-tasks: 2D detection, depth, and 3D pose estimation of individual objects; and joint registration of multiple objects. For training data generation, depth images of physically plausible object pose configurations are generated by a 3D object model in a physics simulation, which…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Advanced Neural Network Applications · Robotics and Sensor-Based Localization
