Spatial Feature Mapping for 6DoF Object Pose Estimation
Jianhan Mei, Xudong Jiang, Henghui Ding

TL;DR
This paper introduces a novel approach for 6DoF object pose estimation in cluttered environments by leveraging spatial structure through graph-based modeling and spherical convolutions, improving accuracy under occlusion.
Contribution
It proposes a graph-based spatial feature mapping method combined with spherical convolution to enhance pose estimation accuracy in challenging scenarios.
Findings
Effective under background clutter and occlusion
Improves pose accuracy on YCB-Video and LINEMOD datasets
Utilizes both depth and RGB information for robust estimation
Abstract
This work aims to estimate 6DoF (6D) object pose in background clutter. Given the strong occlusion and background noise, we propose to exploit spatial structure to better tackle this challenging task. Observing that a 3D mesh can be naturally abstracted as a graph, we build the graph using 3D points as vertices and mesh connections as edges. We construct the corresponding mapping from 2D image features to 3D points to fill the graph and fuse the 2D and 3D features. Afterward, a Graph Convolutional Network (GCN) is applied to support feature exchange among an object's points in 3D space. To address the rotation symmetry ambiguity of objects, a spherical convolution is utilized, and the spherical features are combined with the convolutional features that are mapped to the graph. Predefined 3D keypoints are voted and the 6DoF pose is obtained via the…
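The mesh-as-graph idea in the abstract can be illustrated with a minimal sketch. It assumes triangle faces as the source of mesh connections and a standard symmetric-normalized graph convolution; the abstract does not state which GCN variant the authors use. The names `mesh_to_adjacency` and `gcn_layer`, the toy tetrahedron, and the random features and weights are hypothetical and stand in for the paper's fused 2D/3D features.

```python
import numpy as np


def mesh_to_adjacency(num_vertices, faces):
    """Build a symmetric adjacency matrix: vertices are 3D points,
    edges are the sides of each triangle face (an assumed mesh format)."""
    adj = np.zeros((num_vertices, num_vertices), dtype=np.float64)
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            adj[i, j] = 1.0
            adj[j, i] = 1.0
    return adj


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).
    This normalization is an assumption, not taken from the paper."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)


# Toy mesh: a tetrahedron with 4 vertices and 4 triangular faces.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
adj = mesh_to_adjacency(4, faces)

# Placeholder per-vertex features (stand-ins for image features mapped to 3D points).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))
weight = rng.normal(size=(8, 16))

out = gcn_layer(adj, features, weight)
print(out.shape)
```

Each output row mixes a vertex's own features with those of its mesh neighbors, which is the sense in which a graph convolution lets features be exchanged among an object's 3D points.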
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Human Pose and Action Recognition
Methods: Convolution
