GSGTrack: Gaussian Splatting-Guided Object Pose Tracking from RGB Videos
Zhiyuan Chen, Fan Lu, Guo Yu, Bin Li, Sanqing Qu, Yuan Huang, Changhong Fu, Guang Chen

TL;DR
GSGTrack is a novel RGB-based object pose tracking method that jointly optimizes geometry and pose using Gaussian Splatting, improving robustness against noisy data and depth inaccuracies in monocular videos.
Contribution
It introduces a joint optimization framework with Gaussian Splatting and a graph-based geometry refinement, along with novel loss and selection strategies for robust pose tracking.
Findings
Effective in 6DoF pose tracking on OnePose and HO3D datasets
Outperforms existing RGB-based methods in accuracy and robustness
Enhances object reconstruction quality in monocular videos
Abstract
Tracking the 6DoF pose of unknown objects in monocular RGB video sequences is crucial for robotic manipulation. However, existing approaches typically rely on accurate depth information, which is non-trivial to obtain in real-world scenarios. Although depth estimation algorithms can be employed, geometric inaccuracy can lead to failures in RGBD-based pose tracking methods. To address this challenge, we introduce GSGTrack, a novel RGB-based pose tracking framework that jointly optimizes geometry and pose. Specifically, we adopt 3D Gaussian Splatting to create an optimizable 3D representation, which is learned simultaneously with a graph-based geometry optimization to capture the object's appearance features and refine its geometry. However, the joint optimization process is susceptible to perturbations from noisy pose and geometry data. Thus, we propose an object silhouette loss to…
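The abstract is cut off before it defines the object silhouette loss, so the sketch below is only a generic illustration of what a silhouette-style term can look like, not the paper's formulation. It assumes a rendered opacity map with values in [0, 1] (for example, accumulated alpha from a splatting renderer) and a binary object mask of the same size, and scores their overlap with a soft intersection-over-union. The function name `silhouette_loss` and the `eps` parameter are hypothetical.

```python
def silhouette_loss(rendered_alpha, object_mask, eps=1e-8):
    """Illustrative soft-IoU silhouette loss (an assumption, not GSGTrack's definition).

    rendered_alpha: 2D list of floats in [0, 1], the rendered object opacity.
    object_mask:    2D list of 0/1 values, the observed object segmentation.
    Returns a value near 0.0 for perfect overlap and 1.0 for no overlap.
    """
    intersection = 0.0
    alpha_sum = 0.0
    mask_sum = 0.0
    for alpha_row, mask_row in zip(rendered_alpha, object_mask):
        for alpha, mask in zip(alpha_row, mask_row):
            intersection += alpha * mask  # soft overlap at this pixel
            alpha_sum += alpha
            mask_sum += mask
    union = alpha_sum + mask_sum - intersection
    return 1.0 - intersection / (union + eps)


# Hypothetical 2x2 example: the render covers the mask only partially.
mask = [[1, 0], [0, 0]]
render = [[0.5, 0.5], [0.0, 0.0]]
loss = silhouette_loss(render, mask)
```

A term of this kind penalizes rendered geometry that spills outside the observed object outline or fails to cover it, which is one common way to constrain pose and shape when depth is unreliable.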
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition
Methods: ADaptive gradient method with the OPTimal convergence rate
