Leveraging Positional Encoding for Robust Multi-Reference-Based Object 6D Pose Estimation
Jaewoo Park, Jaeguk Kim, and Nam Ik Cho

TL;DR
This paper introduces a multi-reference-based approach to 6D object pose estimation that combines positional encoding of the object's 3D coordinates with a refinement strategy on the normalized image plane, improving accuracy on the Linemod, Linemod-Occlusion, and YCB-Video datasets.
Contribution
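The paper analyzes the limitations of the two main deep learning approaches to pose estimation, geometric representation regression and iterative refinement, and proposes strategies to overcome them: positional encoding to counter blurry geometric representations, and a multi-reference refinement on the normalized image plane to address the local minimum problem.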
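As a rough illustration of what positional encoding of a 3D coordinate looks like, here is a minimal sketch. It assumes the common sinusoidal form with frequencies 2^k · π; the summary does not state the exact formulation or the number of frequency bands the authors use, so `positional_encoding` and `num_freqs` are hypothetical names and values.

```python
import math

def positional_encoding(xyz, num_freqs=4):
    """Encode each coordinate with sin/cos at frequencies 2^k * pi, k = 0..num_freqs-1.

    Assumption: this is the widely used sinusoidal encoding, not necessarily
    the exact variant used in the paper.
    """
    encoded = []
    for coord in xyz:
        for k in range(num_freqs):
            freq = (2.0 ** k) * math.pi
            encoded.append(math.sin(freq * coord))
            encoded.append(math.cos(freq * coord))
    return encoded

# A hypothetical object-space point with coordinates scaled to roughly [-1, 1].
point = (0.25, -0.5, 0.0)
features = positional_encoding(point, num_freqs=4)
print(len(features))  # 3 coordinates * 4 frequencies * 2 functions
```

Higher values of `k` make the encoding vary faster with the coordinate, which is the sense in which high-frequency components can help a network represent fine geometric detail.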
Findings
Outperforms existing methods on Linemod, Linemod-Occlusion, and YCB-Video datasets.
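Uses positional encoding with high-frequency components for the object's 3D coordinates to counter blurry geometric representations.
Employs adaptive instance normalization and a simple occlusion augmentation to help the model concentrate on the target object.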
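For readers unfamiliar with adaptive instance normalization, the standard operation can be sketched as follows for a single feature channel. This is the generic definition only; the summary does not say which features supply the target statistics in the authors' model, so the `content` and `style` inputs here are placeholders.

```python
import math

def adain_channel(content, style, eps=1e-5):
    """Re-scale one channel of `content` to match the mean and std of `style`.

    Generic AdaIN for a single flattened channel; how the paper conditions
    its network with it is not described in this summary.
    """
    mu_c = sum(content) / len(content)
    mu_s = sum(style) / len(style)
    var_c = sum((x - mu_c) ** 2 for x in content) / len(content)
    var_s = sum((x - mu_s) ** 2 for x in style) / len(style)
    sigma_c = math.sqrt(var_c + eps)
    sigma_s = math.sqrt(var_s + eps)
    return [sigma_s * (x - mu_c) / sigma_c + mu_s for x in content]

# Placeholder activations for one channel.
content = [1.0, 2.0, 3.0, 4.0]
style = [10.0, 20.0, 30.0, 40.0]
out = adain_channel(content, style)
out_mean = sum(out) / len(out)
out_std = math.sqrt(sum((x - out_mean) ** 2 for x in out) / len(out))
print(out_mean, out_std)
```

After the operation, the channel carries the mean and (up to `eps`) the standard deviation of the conditioning input while keeping the relative ordering of the original activations.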
Abstract
Accurately estimating the pose of an object is a crucial task in computer vision and robotics. There are two main deep learning approaches for this: geometric representation regression and iterative refinement. However, these methods have some limitations that reduce their effectiveness. In this paper, we analyze these limitations and propose new strategies to overcome them. To tackle the issue of blurry geometric representation, we use positional encoding with high-frequency components for the object's 3D coordinates. To address the local minimum problem in refinement methods, we introduce a normalized image plane-based multi-reference refinement strategy that's independent of intrinsic matrix constraints. Lastly, we utilize adaptive instance normalization and a simple occlusion augmentation method to help our model concentrate on the target object. Our experiments on Linemod,…
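To see why working on the normalized image plane removes the dependence on the intrinsic matrix, consider this minimal sketch under a standard pinhole model without distortion. The two cameras, their parameters, and the function names are hypothetical and chosen only for illustration; the refinement network itself is not shown.

```python
def project_to_pixel(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)

def pixel_to_normalized(pixel, fx, fy, cx, cy):
    """Undo the intrinsics: map a pixel to the normalized image plane (z = 1)."""
    u, v = pixel
    return ((u - cx) / fx, (v - cy) / fy)

# One 3D point seen by two hypothetical cameras with different intrinsics.
point_cam = (0.1, -0.05, 0.5)
cam_a = dict(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
cam_b = dict(fx=1000.0, fy=1100.0, cx=640.0, cy=360.0)

pixel_a = project_to_pixel(point_cam, **cam_a)
pixel_b = project_to_pixel(point_cam, **cam_b)
norm_a = pixel_to_normalized(pixel_a, **cam_a)
norm_b = pixel_to_normalized(pixel_b, **cam_b)

print(pixel_a, pixel_b)  # pixel coordinates differ between the cameras
print(norm_a, norm_b)    # normalized coordinates agree (x/z, y/z)
```

The pixel coordinates depend on each camera's focal length and principal point, while the normalized coordinates depend only on the point's position in the camera frame, so a model operating on them need not be tied to one camera.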
Taxonomy
Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Hand Gesture Recognition Systems
Methods: Adaptive Instance Normalization · Instance Normalization
