Large-displacement 3D Object Tracking with Hybrid Non-local Optimization
Xuhui Tian, Xinran Lin, Fan Zhong, and Xueying Qin

TL;DR
This paper introduces a hybrid non-local optimization method for 3D object tracking that effectively handles large displacements, achieves real-time performance, and outperforms previous approaches in accuracy.
Contribution
It proposes a novel hybrid approach that combines non-local and local optimizations with a contour-based precomputation for efficient 3D tracking under large displacements.
Findings
Achieves over 50 fps real-time tracking on CPU.
Significantly improves large-displacement accuracy to 81.7%.
Outperforms all previous methods in both small- and large-displacement scenarios.
Abstract
Optimization-based 3D object tracking is known to be precise and fast, but sensitive to large inter-frame displacements. In this paper we propose a fast and effective non-local 3D tracking method. Based on the observation that erroneous local minima are mostly due to out-of-plane rotation, we propose a hybrid approach combining non-local and local optimizations for different parameters, resulting in efficient non-local search in the 6D pose space. In addition, a precomputed robust contour-based tracking method is proposed for the pose optimization. By using long search lines with multiple candidate correspondences, it can adapt to different frame displacements without the need for coarse-to-fine search. After the pre-computation, pose updates can be conducted very fast, enabling the non-local optimization to run in real time. Our method outperforms all previous methods for both…
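The hybrid idea in the abstract can be illustrated with a toy problem. The sketch below is not the authors' implementation. It assumes a made-up two-parameter energy in which `theta` stands in for an out-of-plane rotation angle (non-convex, with spurious local minima) and `t` stands in for the remaining pose parameters (convex). The names `cost`, `grad`, `local_refine`, `hybrid_search`, `TRUE_THETA`, and `TRUE_T`, as well as the sampling radius and spacing, are all hypothetical. Plain gradient descent is used as a stand-in for whatever local optimizer the paper uses.

```python
import math

# Hypothetical toy problem: theta stands in for an out-of-plane rotation angle,
# t for the remaining pose parameters. The target values are made up.
TRUE_THETA, TRUE_T = 2.1, 1.0


def cost(theta, t):
    """Toy energy: non-convex in theta, convex in t, zero at the target."""
    u = theta - TRUE_THETA
    return (1 - math.cos(u)) + 0.5 * (1 - math.cos(4 * u)) + (t - TRUE_T) ** 2


def grad(theta, t):
    """Analytic gradient of the toy energy with respect to (theta, t)."""
    u = theta - TRUE_THETA
    return math.sin(u) + 2 * math.sin(4 * u), 2 * (t - TRUE_T)


def local_refine(theta, t, steps=300, lr=0.05):
    """Local optimization only: fixed-step gradient descent on both parameters."""
    for _ in range(steps):
        g_theta, g_t = grad(theta, t)
        theta -= lr * g_theta
        t -= lr * g_t
    return theta, t


def hybrid_search(theta, t, radius=2.0, spacing=0.5):
    """Non-local sampling over theta, local refinement from each sample."""
    n = int(round(radius / spacing))
    best = None
    for k in range(-n, n + 1):
        candidate = local_refine(theta + k * spacing, t)
        if best is None or cost(*candidate) < cost(*best):
            best = candidate
    return best


# A large displacement: the initial estimate is far from the target in theta.
local_only = local_refine(0.0, 0.0)
hybrid = hybrid_search(0.0, 0.0)
print("local only:", local_only, "cost:", cost(*local_only))
print("hybrid:    ", hybrid, "cost:", cost(*hybrid))
```

In this toy setting, local refinement from the initial estimate settles in a spurious minimum of the `theta` term, while sampling `theta` on a coarse grid and refining each sample reaches the target. The cost of the non-local search grows with the number of samples, which is why the abstract emphasizes restricting it to the parameters that cause local minima and making each pose update cheap through precomputation.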
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Robotics and Sensor-Based Localization
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
