Fast 3D Pose Refinement with RGB Images
Abhinav Jain, Frank Dellaert

TL;DR
This paper introduces a CNN-based system that refines coarse 3D pose estimates from simpler algorithms using RGB images, achieving high precision with minimal training data, suitable for resource-constrained robots.
Contribution
The paper presents a novel, efficient 3D pose refinement method that improves accuracy using RGB images and minimal training data, reducing reliance on complex neural networks.
Findings
Refines coarse 3D poses to high precision
Operates with minimal training data
Suitable for resource-limited robotic systems
Abstract
Pose estimation is a vital step in many robotics and perception tasks, such as robotic manipulation and autonomous vehicle navigation. Current state-of-the-art pose estimation methods rely on deep neural networks with complicated structures and long inference times. While highly robust, they require computing power often unavailable on mobile robots. We propose a CNN-based pose refinement system that takes a coarsely estimated 3D pose from a computationally cheaper algorithm, along with a bounding box image of the object, and returns a highly refined pose. Our experiments on the YCB-Video dataset show that our system can refine 3D poses to an extremely high precision with minimal training data.
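The refinement step described in the abstract can be illustrated with a minimal sketch. It assumes the pose is a rigid-body transform stored as a 4x4 homogeneous matrix and that the network outputs a small correction as an axis-angle rotation plus a translation offset. Neither the parameterization nor the decoupled update below is confirmed by the abstract, and the names `rotvec_to_matrix` and `apply_correction` are hypothetical. The CNN itself is not modeled; its output is replaced by hand-picked numbers.

```python
import numpy as np


def rotvec_to_matrix(rotvec):
    """Convert an axis-angle vector to a 3x3 rotation matrix (Rodrigues' formula)."""
    rotvec = np.asarray(rotvec, dtype=float)
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        return np.eye(3)
    k = rotvec / angle
    # Skew-symmetric cross-product matrix of the unit axis.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def apply_correction(coarse_pose, delta_rotvec, delta_translation):
    """Apply a predicted correction to a coarse 4x4 pose.

    Assumption: rotation and translation are updated independently
    (R_refined = R_delta @ R_coarse, t_refined = t_coarse + t_delta).
    """
    coarse_pose = np.asarray(coarse_pose, dtype=float)
    refined = np.eye(4)
    refined[:3, :3] = rotvec_to_matrix(delta_rotvec) @ coarse_pose[:3, :3]
    refined[:3, 3] = coarse_pose[:3, 3] + np.asarray(delta_translation, dtype=float)
    return refined


# Coarse estimate from a cheap algorithm (illustrative numbers only).
coarse = np.eye(4)
coarse[:3, 3] = [0.10, 0.20, 0.75]

# Stand-in for the CNN output: a small rotation about z and a depth offset.
refined = apply_correction(coarse, [0.0, 0.0, 0.1], [0.0, 0.0, 0.05])
print(refined)
```

A decoupled update keeps the translation correction independent of the rotation correction; a full left-multiplication by a 4x4 correction matrix would be an equally plausible choice.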
Taxonomy
Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Robotics and Sensor-Based Localization
