GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting
Jilan Mei, Junbo Li, Cai Meng

TL;DR
GS2Pose is a two-stage 6D object pose estimation method that pairs 3D Gaussian splatting with a differentiable refinement step, so it works from segmented RGBD images rather than a detailed CAD model.
Contribution
The paper presents a two-stage framework: Pose-Net predicts NOCS images from which a coarse pose is computed, and GS-Refiner then refines that pose through pose-differentiable Gaussian splatting rendering.
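As a rough illustration of how a pose can be computed from NOCS predictions: with RGBD input, each object pixel gives a predicted NOCS coordinate and a back-projected camera-space point, and a common way to align the two point sets is Umeyama's least-squares similarity fit. The sketch below assumes that setup. The function name `umeyama_similarity` and the synthetic data are hypothetical, and this summary does not state which solver Pose-Net's output is actually fed into.

```python
import numpy as np


def umeyama_similarity(src, dst):
    """Least-squares similarity transform (s, R, t) such that dst ~ s * R @ src + t.

    src: (N, 3) NOCS coordinates (assumed to come from a predicted NOCS image).
    dst: (N, 3) camera-space points (assumed to come from back-projected depth).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # 3x3 cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis when needed so R is a proper rotation (det = +1).
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0

    R = U @ D @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


# Synthetic check: transform random NOCS-like points with a known pose.
rng = np.random.default_rng(0)
nocs_pts = rng.uniform(-0.5, 0.5, size=(50, 3))
angle = 0.7
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([0.1, -0.2, 0.8])
s_true = 0.25
cam_pts = s_true * nocs_pts @ R_true.T + t_true

s_est, R_est, t_est = umeyama_similarity(nocs_pts, cam_pts)
```

On noise-free correspondences the fit recovers the scale, rotation, and translation that generated the points. Real NOCS predictions contain outliers, so a robust wrapper such as RANSAC is usually placed around a solver like this.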
Findings
Achieves competitive results on the LineMod dataset.
Effectively handles occlusion and lighting variations.
Does not require high-quality CAD models for training.
Abstract
This paper proposes a new method for accurate and robust 6D pose estimation of novel objects, named GS2Pose. By introducing 3D Gaussian splatting, GS2Pose can utilize the reconstruction results without requiring a high-quality CAD model, which means it only requires segmented RGBD images as input. Specifically, GS2Pose employs a two-stage structure consisting of coarse estimation followed by refined estimation. In the coarse stage, a lightweight U-Net network with a polarization attention mechanism, called Pose-Net, is designed. By using the 3DGS model for supervised training, Pose-Net can generate NOCS images to compute a coarse pose. In the refinement stage, GS2Pose formulates a pose regression algorithm following the idea of reprojection or Bundle Adjustment (BA), referred to as GS-Refiner. By leveraging Lie algebra to extend 3DGS, GS-Refiner obtains a pose-differentiable rendering…
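To make the Lie algebra idea concrete, the sketch below shows the standard exponential map from a 6-vector in se(3) to a 4x4 rigid transform, which is one way a pose can be exposed as an unconstrained parameter for gradient-based updates. It is not GS-Refiner's implementation: the names `hat` and `se3_exp` are hypothetical, and ordering the vector as translation part first, rotation part second is an assumed convention.

```python
import numpy as np


def hat(w):
    """Skew-symmetric matrix of a 3-vector, so that hat(w) @ v == np.cross(w, v)."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def se3_exp(xi):
    """Exponential map se(3) -> SE(3).

    xi: 6-vector (rho, phi), translation part first (assumed convention).
    Returns a 4x4 homogeneous transform.
    """
    xi = np.asarray(xi, dtype=float)
    rho, phi = xi[:3], xi[3:]
    theta = np.linalg.norm(phi)
    K = hat(phi)

    if theta < 1e-8:
        # Taylor expansion near zero rotation avoids dividing by theta.
        R = np.eye(3) + K + 0.5 * K @ K
        V = np.eye(3) + 0.5 * K + K @ K / 6.0
    else:
        A = np.sin(theta) / theta
        B = (1.0 - np.cos(theta)) / theta ** 2
        C = (theta - np.sin(theta)) / theta ** 3
        R = np.eye(3) + A * K + B * K @ K  # Rodrigues' formula
        V = np.eye(3) + B * K + C * K @ K  # left Jacobian of SO(3)

    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ rho
    return T


# A quarter turn about the z axis with no translation component.
T_rot = se3_exp([0.0, 0.0, 0.0, 0.0, 0.0, np.pi / 2])
# A pure translation (zero rotation part).
T_trans = se3_exp([1.0, 2.0, 3.0, 0.0, 0.0, 0.0])
```

Because any 6-vector maps to a valid rigid transform, an optimizer can update the vector freely without re-orthogonalizing a rotation matrix after each step.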
Taxonomy
Topics: Image and Object Detection Techniques · Robot Manipulation and Learning · Hand Gesture Recognition Systems
Methods: Softmax · Attention Is All You Need · Convolution · Concatenated Skip Connection · Max Pooling · U-Net
