Active 6D Multi-Object Pose Estimation in Cluttered Scenarios with Deep Reinforcement Learning
Juil Sock, Guillermo Garcia-Hernando, Tae-Kyun Kim

TL;DR
This paper introduces a reinforcement learning-based method that strategically selects camera movements to improve 6D multi-object pose estimation in cluttered scenes. It respects real-world constraints such as time and distance traveled, and does not rely on ground-truth annotations at inference.
Contribution
It presents a novel framework that trains an agent in simulation to select camera movements for better pose estimation, avoiding the need for viewpoint rendering during actual deployment.
Findings
Successfully estimates 6D object poses in cluttered synthetic and real scenarios.
Outperforms strong baseline methods in pose estimation accuracy.
Learns effective camera movement strategies using reinforcement learning.
Abstract
In this work, we explore how a strategic selection of camera movements can facilitate the task of 6D multi-object pose estimation in cluttered scenarios while respecting real-world constraints important in robotics and augmented reality applications, such as time and distance traveled. In the proposed framework, a set of multiple object hypotheses is given to an agent, which is inferred by an object pose estimator and subsequently spatio-temporally selected by a fusion function that makes use of a verification score that circumvents the need of ground-truth annotations. The agent reasons about these hypotheses, directing its attention to the object which it is most uncertain about, moving the camera towards such an object. Unlike previous works that propose short-sighted policies, our agent is trained in simulated scenarios using reinforcement learning, attempting to learn the camera…
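The hypothesis fusion and attention step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper learns its camera policy with reinforcement learning, whereas the code below uses a simple greedy rule as a stand-in. The `Hypothesis` class, the `fuse` and `most_uncertain` functions, the score range, and the "keep the highest-scoring hypothesis per object" fusion rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    object_id: int   # which object instance this hypothesis refers to
    pose: tuple      # hypothetical 6D pose placeholder: (x, y, z, roll, pitch, yaw)
    score: float     # hypothetical verification score; higher means more trusted


def fuse(accumulated, new_hypotheses):
    """Assumed fusion rule: per object, keep the best-scoring hypothesis seen so far."""
    fused = dict(accumulated)
    for h in new_hypotheses:
        best = fused.get(h.object_id)
        if best is None or h.score > best.score:
            fused[h.object_id] = h
    return fused


def most_uncertain(fused):
    """Greedy stand-in for the learned policy: pick the lowest-scoring object."""
    return min(fused.values(), key=lambda h: h.score).object_id


# Two camera views produce hypotheses for partially overlapping sets of objects.
zero_pose = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
view1 = [Hypothesis(0, zero_pose, 0.9), Hypothesis(1, zero_pose, 0.3)]
view2 = [Hypothesis(1, zero_pose, 0.6), Hypothesis(2, zero_pose, 0.4)]

fused = fuse(fuse({}, view1), view2)
target = most_uncertain(fused)  # object the camera would move toward next
```

In this toy run, object 1's hypothesis improves after the second view, so the least trusted object becomes object 2. A learned policy would additionally weigh constraints such as time and distance traveled, which this greedy rule ignores.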
Taxonomy
Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
