Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images
Jean-Philippe Mercier, Chaitanya Mitash, Philippe Giguère, and Abdeslam Boularias

TL;DR
This paper presents a method for training an object detector and 6D pose estimator using synthetic data and a small set of weakly labeled real images, employing adversarial domain adaptation to improve real-world performance for robotic manipulation tasks.
Contribution
The authors introduce a novel training pipeline combining simulation, weakly labeled real images, and adversarial domain adaptation for effective 6D pose estimation.
Findings
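A small set of weakly annotated real images substantially improves a detector first trained in simulation.
Adversarial domain adaptation is used to narrow the gap between synthetic and real images.
The approach targets cluttered, occluded scenes such as those encountered in robotic grasping and manipulation.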
Abstract
This work proposes a process for efficiently training a point-wise object detector that enables localizing objects and computing their 6D poses in cluttered and occluded scenes. Accurate pose estimation is typically a requirement for robust robotic grasping and manipulation of objects placed in cluttered, tight environments, such as a shelf with multiple objects. To minimize the human labor required for annotation, the proposed object detector is first trained in simulation by using automatically annotated synthetic images. We then show that the performance of the detector can be substantially improved by using a small set of weakly annotated real images, where a human provides only a list of objects present in each image without indicating the location of the objects. To close the gap between real and synthetic images, we adopt a domain adaptation approach through adversarial training.…
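The weak annotation described in the abstract (a list of objects present in the image, with no locations) can be turned into a training signal in several ways. The sketch below shows one common option and is an assumption, not the authors' confirmed implementation: per-pixel class score maps are collapsed to one image-level logit per class with a global max, and a binary cross-entropy compares those logits with the presence list. The names `image_level_logits` and `weak_label_loss`, and the toy score maps, are hypothetical.

```python
import numpy as np

def image_level_logits(score_maps):
    """Collapse per-pixel class scores of shape (C, H, W) to one logit per class.

    Assumption: global max pooling is used as the aggregation step.
    """
    return score_maps.reshape(score_maps.shape[0], -1).max(axis=1)

def weak_label_loss(score_maps, present):
    """Mean binary cross-entropy between image-level logits and a 0/1 presence vector."""
    z = image_level_logits(score_maps)
    y = np.asarray(present, dtype=float)
    # Numerically stable BCE with logits: max(z, 0) - z*y + log(1 + exp(-|z|))
    per_class = np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))
    return per_class.mean()

# Toy example (hypothetical data): 3 classes on a 4x4 grid,
# with a single strong response for class 0.
maps = np.full((3, 4, 4), -5.0)
maps[0, 1, 2] = 5.0

loss_right = weak_label_loss(maps, [1, 0, 0])  # annotation agrees with the maps
loss_wrong = weak_label_loss(maps, [0, 1, 0])  # annotation disagrees with the maps
```

With this aggregation, the annotation only has to say which objects appear. The gradient flows to the strongest-responding pixel of each class, which is what allows a point-wise detector to be refined without location labels.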
